A new way to see the activity inside a living cell

Living cells are bombarded with many kinds of incoming molecular signals that influence their behavior. Being able to measure those signals, and how cells respond to them through downstream molecular signaling networks, could help scientists learn much more about how cells work, including what happens as they age or become diseased.

Right now, this kind of comprehensive study is not possible because current techniques for imaging cells are limited to just a handful of different molecule types within a cell at one time. However, MIT researchers have developed an alternative method that allows them to observe up to seven different molecules at a time, and potentially even more than that.

“There are many examples in biology where an event triggers a long downstream cascade of events, which then causes a specific cellular function,” says Edward Boyden, the Y. Eva Tan Professor in Neurotechnology. “How does that occur? It’s arguably one of the fundamental problems of biology, and so we wondered, could you simply watch it happen?”

The new approach makes use of green or red fluorescent molecules that flicker on and off at different rates. By imaging a cell over several seconds, minutes, or hours, and then extracting each of the fluorescent signals using a computational algorithm, the amount of each target protein can be tracked as it changes over time.

Boyden, who is also a professor of biological engineering and of brain and cognitive sciences at MIT, a Howard Hughes Medical Institute investigator, and a member of MIT’s McGovern Institute for Brain Research and Koch Institute for Integrative Cancer Research, as well as the co-director of the K. Lisa Yang Center for Bionics, is the senior author of the study, which appears today in Cell. MIT postdoc Yong Qian is the lead author of the paper.

Fluorescent signals

Labeling molecules inside cells with fluorescent proteins has allowed researchers to learn a great deal about the functions of many cellular molecules. This type of study is often done with green fluorescent protein (GFP), which was first deployed for imaging in the 1990s. Since then, several fluorescent proteins that glow in other colors have been developed for experimental use.

However, a typical light microscope can only distinguish two or three of these colors, allowing researchers only a tiny glimpse of the overall activity that is happening inside a cell. If they could track a greater number of labeled molecules, researchers could measure a brain cell’s response to different neurotransmitters during learning, for example, or investigate the signals that prompt a cancer cell to metastasize.

“Ideally, you would be able to watch the signals in a cell as they fluctuate in real time, and then you could understand how they relate to each other. That would tell you how the cell computes,” Boyden says. “The problem is that you can’t watch very many things at the same time.”

In 2020, Boyden’s lab developed a way to simultaneously image up to five different molecules within a cell, by targeting glowing reporters to distinct locations inside the cell. This approach, known as “spatial multiplexing,” allows researchers to distinguish signals for different molecules even though they may all be fluorescing the same color.

In the new study, the researchers took a different approach: Instead of distinguishing signals based on their physical location, they created fluorescent signals that vary over time. The technique relies on “switchable fluorophores” — fluorescent proteins that turn on and off at a specific rate. For this study, Boyden and his group members identified four green switchable fluorophores, and then engineered two more, all of which turn on and off at different rates. They also identified two red fluorescent proteins that switch at different rates, and engineered one additional red fluorophore.

Each of these switchable fluorophores can be used to label a different type of molecule within a living cell, such as an enzyme, signaling protein, or part of the cell cytoskeleton. After imaging the cell for several minutes, hours, or even days, the researchers use a computational algorithm to pick out the specific signal from each fluorophore, analogous to how the human ear can pick out different frequencies of sound.

“In a symphony orchestra, you have high-pitched instruments, like the flute, and low-pitched instruments, like a tuba. And in the middle are instruments like the trumpet. They all have different sounds, and our ear sorts them out,” Boyden says.

The mathematical technique that the researchers used to analyze the fluorophore signals is known as linear unmixing. This method can extract the signal of each fluorophore from the combined recording, much as a Fourier transform separates a piece of music into its component pitches.
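The idea behind linear unmixing can be sketched in a few lines: if each fluorophore's on/off switching pattern over time is known, a pixel's measured intensity trace is approximately a weighted sum of those patterns, and the weights (the abundance of each labeled protein) can be recovered by least squares. The toy example below is illustrative only; the square-wave signatures, switching periods, and noise level are assumptions, not values from the study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical temporal signatures: each switchable fluorophore blinks
# at its own rate, giving a distinct on/off pattern over T time points.
T = 200
t = np.arange(T)
signatures = np.stack([
    (np.sin(2 * np.pi * t / period) > 0).astype(float)
    for period in (10, 25, 60)      # assumed switching periods
], axis=1)                          # shape (T, 3)

# A pixel's measured trace is a weighted sum of the signatures
# (weights = how much of each labeled protein is present) plus noise.
true_weights = np.array([2.0, 0.5, 1.2])
trace = signatures @ true_weights + 0.05 * rng.standard_normal(T)

# Linear unmixing: solve the least-squares problem for the weights.
est_weights, *_ = np.linalg.lstsq(signatures, trace, rcond=None)
print(np.round(est_weights, 2))
```

Repeating this fit at every pixel, frame window by frame window, is what lets one microscope channel carry several distinguishable labels.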

Once this analysis is complete, the researchers can see when and where each fluorescently labeled molecule was found in the cell during the entire imaging period. The imaging itself can be done with a simple light microscope, with no specialized equipment required.

Biological phenomena

In this study, the researchers demonstrated their approach by labeling six different molecules involved in the cell division cycle, in mammalian cells. This allowed them to identify patterns in how the levels of enzymes called cyclin-dependent kinases change as a cell progresses through the cell cycle.

The researchers also showed that they could label other types of kinases, which are involved in nearly every aspect of cell signaling, as well as cell structures and organelles such as the cytoskeleton and mitochondria. In addition to their experiments using mammalian cells grown in a lab dish, the researchers showed that this technique could work in the brains of zebrafish larvae.

This method could be useful for observing how cells respond to any kind of input, such as nutrients, immune system factors, hormones, or neurotransmitters, according to the researchers. It could also be used to study how cells respond to changes in gene expression or genetic mutations. All of these factors play important roles in biological phenomena such as growth, aging, cancer, neurodegeneration, and memory formation.

“You could consider all of these phenomena to represent a general class of biological problem, where some short-term event — like eating a nutrient, learning something, or getting an infection — generates a long-term change,” Boyden says.

In addition to pursuing those types of studies, Boyden’s lab is also working on expanding the repertoire of switchable fluorophores so that they can study even more signals within a cell. They also hope to adapt the system so that it could be used in mouse models.

The research was funded by an Alana Fellowship, K. Lisa Yang, John Doerr, Jed McCaleb, James Fickel, Ashar Aziz, the K. Lisa Yang and Hock E. Tan Center for Molecular Therapeutics at MIT, the Howard Hughes Medical Institute, and the National Institutes of Health.

Source: A new way to see the activity inside a living cell

Team engineers nanoparticles using ion irradiation to advance clean energy and fuel conversion

MIT researchers and colleagues have demonstrated a way to precisely control the size, composition, and other properties of nanoparticles key to the reactions involved in a variety of clean energy and environmental technologies. They did so by leveraging ion irradiation, a technique in which beams of charged particles bombard a material.

They went on to show that nanoparticles created this way outperform their conventionally made counterparts.

“The materials we have worked on could advance several technologies, from fuel cells to generate CO2-free electricity to the production of clean hydrogen feedstocks for the chemical industry [through electrolysis cells],” says Bilge Yildiz, leader of the work and a professor in MIT’s departments of Nuclear Science and Engineering and Materials Science and Engineering.

Critical catalyst

Fuel cells and electrolysis cells both rely on electrochemical reactions involving three principal parts: two electrodes (a cathode and an anode) separated by an electrolyte. The difference between the two cells is that their reactions run in reverse.

The electrodes are coated with catalysts, or materials that make the reactions involved go faster. But a critical catalyst made of metal-oxide materials has been limited by challenges including low durability. “The metal catalyst particles coarsen at high temperatures, and you lose surface area and activity as a result,” says Yildiz, who is also affiliated with the Materials Research Laboratory and is an author of an open-access paper on the work published in the journal Energy & Environmental Science.

Enter metal exsolution, which involves precipitating metal nanoparticles out of a host oxide onto the surface of the electrode. The particles embed themselves into the electrode, “and that anchoring makes them more stable,” says Yildiz. As a result, exsolution has “led to remarkable progress in clean energy conversion and energy-efficient computing devices,” the researchers write in their paper.

However, controlling the precise properties of the resulting nanoparticles has been difficult. “We know that exsolution can give us stable and active nanoparticles, but the challenging part is really to control it. The novelty of this work is that we’ve found a tool — ion irradiation — that can give us that control,” says Jiayue Wang PhD ’22, first author of the paper. Wang, who conducted the work while earning his PhD in the MIT Department of Nuclear Science and Engineering, is now a postdoc at Stanford University.

Sossina Haile ’86, PhD ’92, the Walter P. Murphy Professor of Materials Science and Engineering at Northwestern University, who was not involved in the current work, says:

“Metallic nanoparticles serve as catalysts in a whole host of reactions, including the important reaction of splitting water to generate hydrogen for energy storage. In this work, Yildiz and colleagues have created an ingenious method for controlling the way that nanoparticles form.”

Haile continues, “The community has shown that exsolution results in structurally stable nanoparticles, but the process is not easy to control, so one doesn’t necessarily get the optimal number and size of particles. Using ion irradiation, this group was able to precisely control the features of the nanoparticles, resulting in excellent catalytic activity for water splitting.”

What they did

The researchers found that aiming a beam of ions at the electrode while simultaneously exsolving metal nanoparticles onto the electrode’s surface allowed them to control several properties of the resulting nanoparticles.

“Through ion-matter interactions, we have successfully engineered the size, composition, density, and location of the exsolved nanoparticles,” the team writes in Energy & Environmental Science.

For example, they could make the particles much smaller — down to 2 billionths of a meter (2 nanometers) in diameter — than those made using conventional thermal exsolution methods alone. Further, they were able to change the composition of the nanoparticles by irradiating with specific elements. They demonstrated this with a beam of nickel ions that implanted nickel into the exsolved metal nanoparticles, providing a direct and convenient way to engineer the composition of exsolved nanoparticles.

“We want to have multi-element nanoparticles, or alloys, because they usually have higher catalytic activity,” Yildiz says. “With our approach, the exsolution target does not have to be dependent on the substrate oxide itself.” Irradiation opens the door to many more compositions. “We can pretty much choose any oxide and any ion that we can irradiate with and exsolve that,” says Yildiz.

The team also found that ion irradiation forms defects in the electrode itself. And these defects provide additional nucleation sites, or places for the exsolved nanoparticles to grow from, increasing the density of the resulting nanoparticles.

Irradiation could also allow extreme spatial control over the nanoparticles. “Because you can focus the ion beam, you can imagine that you could ‘write’ with it to form specific nanostructures,” says Wang. “We did a preliminary demonstration [of that], but we believe it has potential to realize well-controlled micro- and nano-structures.”

The team also showed that the nanoparticles they created with ion irradiation had greater catalytic activity than those created by conventional thermal exsolution alone.

Additional MIT authors of the paper are Kevin B. Woller, a principal research scientist at the Plasma Science and Fusion Center (PSFC), home to the equipment used for ion irradiation; Abinash Kumar PhD ’22, who received his PhD from the Department of Materials Science and Engineering (DMSE) and is now at Oak Ridge National Laboratory; and James M. LeBeau, an associate professor in DMSE. Other authors are Zhan Zhang and Hua Zhou of Argonne National Laboratory, and Iradwikanari Waluyo and Adrian Hunt of Brookhaven National Laboratory.

This work was funded by the OxEon Corp. and MIT’s PSFC. The research also used resources supported by the U.S. Department of Energy Office of Science, MIT’s Materials Research Laboratory, and MIT.nano. The work was performed, in part, at Harvard University through a network funded by the National Science Foundation.

Source: Team engineers nanoparticles using ion irradiation to advance clean energy and fuel conversion

New method uses crowdsourced feedback to help train robots

To teach an AI agent a new task, like how to open a kitchen cabinet, researchers often use reinforcement learning — a trial-and-error process where the agent is rewarded for taking actions that get it closer to the goal.

In many instances, a human expert must carefully design a reward function, which is an incentive mechanism that gives the agent motivation to explore. The human expert must iteratively update that reward function as the agent explores and tries different actions. This can be time-consuming, inefficient, and difficult to scale up, especially when the task is complex and involves many steps.
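As a rough illustration of what such hand-engineered reward design looks like, here is a hypothetical shaped reward for a cabinet-opening task. Every function name, weight, and threshold below is invented for the sketch; in practice each term would need iterative re-tuning by an expert as the agent finds loopholes.

```python
import numpy as np

def cabinet_reward(hand_pos, handle_pos, door_angle, target_angle=1.2):
    """Hypothetical hand-designed reward for 'open the cabinet'.

    The designer must weight each shaping term by hand, which is the
    time-consuming engineering step the new approach aims to avoid.
    """
    reach_bonus = -np.linalg.norm(hand_pos - handle_pos)   # get near the handle
    open_bonus = -abs(door_angle - target_angle)           # swing door toward target
    success = 10.0 if abs(door_angle - target_angle) < 0.1 else 0.0
    return 0.5 * reach_bonus + 1.0 * open_bonus + success

# Far from the handle, door closed: low reward.
r_far = cabinet_reward(np.array([0.0, 0.0, 0.0]), np.array([1.0, 0.0, 0.5]), 0.0)
# Hand on the handle, door fully open: success bonus dominates.
r_done = cabinet_reward(np.array([1.0, 0.0, 0.5]), np.array([1.0, 0.0, 0.5]), 1.2)
print(r_far, r_done)
```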

Researchers from MIT, Harvard University, and the University of Washington have developed a new reinforcement learning approach that doesn’t rely on an expertly designed reward function. Instead, it leverages crowdsourced feedback, gathered from many nonexpert users, to guide the agent as it learns to reach its goal.

While some other methods also attempt to utilize nonexpert feedback, this new approach enables the AI agent to learn more quickly, despite the fact that data crowdsourced from users are often full of errors. These noisy data might cause other methods to fail.

In addition, this new approach allows feedback to be gathered asynchronously, so nonexpert users around the world can contribute to teaching the agent.

“One of the most time-consuming and challenging parts in designing a robotic agent today is engineering the reward function. Today reward functions are designed by expert researchers — a paradigm that is not scalable if we want to teach our robots many different tasks. Our work proposes a way to scale robot learning by crowdsourcing the design of reward function and by making it possible for nonexperts to provide useful feedback,” says Pulkit Agrawal, an assistant professor in the MIT Department of Electrical Engineering and Computer Science (EECS) who leads the Improbable AI Lab in the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL).

In the future, this method could help a robot learn to perform specific tasks in a user’s home quickly, without the owner needing to show the robot physical examples of each task. The robot could explore on its own, with crowdsourced nonexpert feedback guiding its exploration.

“In our method, the reward function guides the agent to what it should explore, instead of telling it exactly what it should do to complete the task. So, even if the human supervision is somewhat inaccurate and noisy, the agent is still able to explore, which helps it learn much better,” explains lead author Marcel Torne ’23, a research assistant in the Improbable AI Lab.

Torne is joined on the paper by his MIT advisor, Agrawal; senior author Abhishek Gupta, assistant professor at the University of Washington; as well as others at the University of Washington and MIT. The research will be presented at the Conference on Neural Information Processing Systems next month.

Noisy feedback

One way to gather user feedback for reinforcement learning is to show a user two photos of states achieved by the agent, and then ask that user which state is closer to a goal. For instance, perhaps a robot’s goal is to open a kitchen cabinet. One image might show that the robot opened the cabinet, while the second might show that it opened the microwave. A user would pick the photo of the “better” state.

Some previous approaches try to use this crowdsourced, binary feedback to optimize a reward function that the agent would use to learn the task. However, because nonexperts are likely to make mistakes, the reward function can become very noisy, so the agent might get stuck and never reach its goal.

“Basically, the agent would take the reward function too seriously. It would try to match the reward function perfectly. So, instead of directly optimizing over the reward function, we just use it to tell the robot which areas it should be exploring,” Torne says.

He and his collaborators decoupled the process into two separate parts, each directed by its own algorithm. They call their new reinforcement learning method HuGE (Human Guided Exploration).

On one side, a goal selector algorithm is continuously updated with crowdsourced human feedback. The feedback is not used as a reward function, but rather to guide the agent’s exploration. In a sense, the nonexpert users drop breadcrumbs that incrementally lead the agent toward its goal.

On the other side, the agent explores on its own, in a self-supervised manner guided by the goal selector. It collects images or videos of actions that it tries, which are then sent to humans and used to update the goal selector.

This narrows down the area for the agent to explore, leading it to more promising areas that are closer to its goal. But if there is no feedback, or if feedback takes a while to arrive, the agent will keep learning on its own, albeit in a slower manner. This enables feedback to be gathered infrequently and asynchronously.
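A minimal sketch of this guide-rather-than-reward idea (a simplification invented here, not the authors' exact HuGE algorithm): noisy pairwise comparisons from simulated nonexperts update a score per visited state, and the agent picks its next exploration goal from the highest-scoring states. Even with a 10 percent labeling error rate, the aggregate signal points toward the goal.

```python
import random

random.seed(0)

# Ten states the agent has visited, indexed 0..9; assume state 9 is the goal.
scores = [0.0] * 10

def human_label(a, b, error_rate=0.1):
    """Simulated noisy nonexpert: usually picks the state closer to goal 9,
    but answers incorrectly 10 percent of the time."""
    closer = a if abs(9 - a) < abs(9 - b) else b
    farther = b if closer == a else a
    return closer if random.random() > error_rate else farther

# Comparisons can arrive asynchronously; each one just nudges the scores.
for _ in range(2000):
    a, b = random.sample(range(10), 2)
    winner = human_label(a, b)
    loser = b if winner == a else a
    scores[winner] += 1.0
    scores[loser] -= 1.0

# The goal selector biases exploration toward the highest-scoring state,
# while the agent keeps learning on its own between feedback batches.
exploration_goal = max(range(10), key=lambda s: scores[s])
print(exploration_goal)
```

Because the scores only steer where the agent explores, individual wrong answers wash out instead of corrupting an optimized reward function.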

“The exploration loop can keep going autonomously, because it is just going to explore and learn new things. And then when you get some better signal, it is going to explore in more concrete ways. You can just keep them turning at their own pace,” adds Torne.

And because the feedback is just gently guiding the agent’s behavior, it will eventually learn to complete the task even if users provide incorrect answers.

Faster learning

The researchers tested this method on a number of simulated and real-world tasks. In simulation, they used HuGE to effectively learn tasks with long sequences of actions, such as stacking blocks in a particular order or navigating a large maze.

In real-world tests, they utilized HuGE to train robotic arms to draw the letter “U” and pick and place objects. For these tests, they crowdsourced data from 109 nonexpert users in 13 different countries spanning three continents.

In real-world and simulated experiments, HuGE helped agents learn to achieve the goal faster than other methods.

The researchers also found that data crowdsourced from nonexperts yielded better performance than synthetic data, which were produced and labeled by the researchers. For nonexpert users, labeling 30 images or videos took fewer than two minutes.

“This makes it very promising in terms of being able to scale up this method,” Torne adds.

In a related paper, which the researchers presented at the recent Conference on Robot Learning, they enhanced HuGE so an AI agent can learn to perform the task, and then autonomously reset the environment to continue learning. For instance, if the agent learns to open a cabinet, the method also guides the agent to close the cabinet.

“Now we can have it learn completely autonomously without needing human resets,” he says.

The researchers also emphasize that, in this and other learning approaches, it is critical to ensure that AI agents are aligned with human values.

In the future, they want to continue refining HuGE so the agent can learn from other forms of communication, such as natural language and physical interactions with the robot. They are also interested in applying this method to teach multiple agents at once.

This research is funded, in part, by the MIT-IBM Watson AI Lab.

Source: New method uses crowdsourced feedback to help train robots

Search algorithm reveals nearly 200 new kinds of CRISPR systems

Microbial sequence databases contain a wealth of information about enzymes and other molecules that could be adapted for biotechnology. But these databases have grown so large in recent years that they’ve become difficult to search efficiently for enzymes of interest.

Now, scientists at the McGovern Institute for Brain Research at MIT, the Broad Institute of MIT and Harvard, and the National Center for Biotechnology Information (NCBI) at the National Institutes of Health have developed a new search algorithm that has identified 188 new kinds of rare CRISPR systems in bacterial genomes, encompassing thousands of individual systems. The work appears today in Science.

The algorithm, which comes from the lab of pioneering CRISPR researcher Professor Feng Zhang, uses big-data clustering approaches to rapidly search massive amounts of genomic data. The team used their algorithm, called Fast Locality-Sensitive Hashing-based clustering (FLSHclust), to mine three major public databases that contain data from a wide range of unusual bacteria, including ones found in coal mines, breweries, Antarctic lakes, and dog saliva. The scientists found a surprising number and diversity of CRISPR systems, including ones that could make edits to DNA in human cells, others that can target RNA, and many with a variety of other functions.

The new systems could potentially be harnessed to edit mammalian cells with fewer off-target effects than current Cas9 systems. They could also one day be used as diagnostics or serve as molecular records of activity inside cells.

The researchers say their search highlights an unprecedented level of diversity and flexibility of CRISPR and that there are likely many more rare systems yet to be discovered as databases continue to grow.

“Biodiversity is such a treasure trove, and as we continue to sequence more genomes and metagenomic samples, there is a growing need for better tools, like FLSHclust, to search that sequence space to find the molecular gems,” says Zhang, a co-senior author on the study and the James and Patricia Poitras Professor of Neuroscience at MIT with joint appointments in the departments of Brain and Cognitive Sciences and Biological Engineering. Zhang is also an investigator at the McGovern Institute for Brain Research at MIT, a core institute member at the Broad, and an investigator at the Howard Hughes Medical Institute. Eugene Koonin, a distinguished investigator at the NCBI, is co-senior author on the study as well.

Searching for CRISPR

CRISPR, which stands for clustered regularly interspaced short palindromic repeats, is a bacterial defense system that has been engineered into many tools for genome editing and diagnostics.

To mine databases of protein and nucleic acid sequences for novel CRISPR systems, the researchers developed an algorithm based on an approach borrowed from the big data community. This technique, called locality-sensitive hashing, clusters together objects that are similar but not exactly identical. Using this approach allowed the team to probe billions of protein and DNA sequences — from the NCBI, its Whole Genome Shotgun database, and the Joint Genome Institute — in weeks, whereas previous methods that look for identical objects would have taken months. They designed their algorithm to look for genes associated with CRISPR.
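The core trick of locality-sensitive hashing can be illustrated with MinHash over k-mer sets, a standard LSH scheme for set similarity (a simplified stand-in sketched here, not FLSHclust itself): sequences with similar k-mer content receive signatures that mostly agree, and banding the signatures makes near-duplicates collide in a hash bucket without any all-pairs comparison. The toy sequences and parameters below are invented for illustration.

```python
import hashlib
from collections import defaultdict

def kmers(seq, k=4):
    """Break a sequence into its set of overlapping k-mers."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def minhash_signature(kmer_set, n_hashes=32):
    """MinHash: for each of n seeded hash functions, keep the minimum hash.
    Sets with high Jaccard similarity get signatures that mostly agree."""
    return tuple(
        min(int(hashlib.sha1(f"{seed}:{km}".encode()).hexdigest(), 16)
            for km in kmer_set)
        for seed in range(n_hashes)
    )

def lsh_buckets(seqs, band_size=4):
    """Band the signatures: sequences sharing any one band land in the same
    bucket, so near-duplicates collide without comparing every pair."""
    buckets = defaultdict(set)
    for name, seq in seqs.items():
        sig = minhash_signature(kmers(seq))
        for b in range(0, len(sig), band_size):
            buckets[(b, sig[b:b + band_size])].add(name)
    return buckets

# Toy protein fragments: A and B differ by one residue; C is unrelated.
seqs = {
    "A": "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ",
    "B": "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVA",
    "C": "GGSGGSLLWPAVNNHHTTRRKKEEDDFFYYWWM",
}
hits = {frozenset(m) for m in lsh_buckets(seqs).values() if len(m) > 1}
print(hits)
```

Because hashing and bucketing are linear in the number of sequences, this scales to billions of entries where exact pairwise comparison would not.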

“This new algorithm allows us to parse through data in a time frame that’s short enough that we can actually recover results and make biological hypotheses,” says Soumya Kannan PhD ’23, who is a co-first author on the study. Kannan was a graduate student in Zhang’s lab when the study began and is currently a postdoc and Junior Fellow at Harvard University. Han Altae-Tran PhD ’23, a graduate student in Zhang’s lab during the study and currently a postdoc at the University of Washington, was the study’s other co-first author.

“This is a testament to what you can do when you improve on the methods for exploration and use as much data as possible,” says Altae-Tran. “It’s really exciting to be able to improve the scale at which we search.”

New systems

In their analysis, Altae-Tran, Kannan, and their colleagues noticed that the thousands of CRISPR systems they found fell into a few existing and many new categories. They studied several of the new systems in greater detail in the lab.

They found several new variants of known Type I CRISPR systems, which use a guide RNA that is 32 nucleotides long rather than the 20-nucleotide guide of Cas9. Because of their longer guide RNAs, these Type I systems could potentially be used to develop more precise gene-editing technology that is less prone to off-target editing. Zhang’s team showed that two of these systems could make short edits in the DNA of human cells. And because these Type I systems are similar in size to CRISPR-Cas9, they could likely be delivered to cells in animals or humans using the same gene-delivery technologies being used today for CRISPR.

One of the Type I systems also showed “collateral activity” — broad degradation of nucleic acids after the CRISPR protein binds its target. Scientists have used similar systems to make infectious disease diagnostics such as SHERLOCK, a tool capable of rapidly sensing a single molecule of DNA or RNA. Zhang’s team thinks the new systems could be adapted for diagnostic technologies as well.

The researchers also uncovered new mechanisms of action for some Type IV CRISPR systems, and a Type VII system that precisely targets RNA, which could potentially be used in RNA editing. Other systems could potentially be used as recording tools — a molecular document of when a gene was expressed — or as sensors of specific activity in a living cell.

Mining data

The scientists say their algorithm could aid in the search for other biochemical systems. “This search algorithm could be used by anyone who wants to work with these large databases for studying how proteins evolve or discovering new genes,” Altae-Tran says.

The researchers add that their findings illustrate not only how diverse CRISPR systems are, but also that most are rare and only found in unusual bacteria. “Some of these microbial systems were exclusively found in water from coal mines,” Kannan says. “If someone hadn’t been interested in that, we may never have seen those systems. Broadening our sampling diversity is really important to continue expanding the diversity of what we can discover.”

This work was supported by the Howard Hughes Medical Institute; the K. Lisa Yang and Hock E. Tan Molecular Therapeutics Center at MIT; Broad Institute Programmable Therapeutics Gift Donors; The Pershing Square Foundation, William Ackman and Neri Oxman; James and Patricia Poitras; BT Charitable Foundation; Asness Family Foundation; Kenneth C. Griffin; the Phillips family; David Cheng; and Robert Metcalfe.

Source: Search algorithm reveals nearly 200 new kinds of CRISPR systems

Synthetic imagery sets new bar in AI training efficiency

Data is the new soil, and in this fertile new ground, MIT researchers are planting more than just pixels. By using synthetic images to train machine learning models, a team of scientists recently surpassed results obtained from traditional “real-image” training methods. 

At the core of the approach is a system called StableRep, which doesn't just use any synthetic images; it generates them through ultra-popular text-to-image models like Stable Diffusion. It’s like creating worlds with words. 

So what’s in StableRep's secret sauce? A strategy called “multi-positive contrastive learning.”

“We're teaching the model to learn more about high-level concepts through context and variance, not just feeding it data,” says Lijie Fan, an MIT PhD student in electrical engineering, an affiliate of the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL), and lead researcher on the work. “When multiple images, all generated from the same text, are all treated as depictions of the same underlying thing, the model dives deeper into the concepts behind the images, say the object, not just their pixels.”

This approach considers multiple images spawned from identical text prompts as positive pairs, providing additional information during training, not just adding more diversity but specifying to the vision system which images are alike and which are different. Remarkably, StableRep outshone the prowess of top-tier models trained on real images, such as SimCLR and CLIP, in extensive datasets.
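A schematic version of a multi-positive contrastive loss can be written in a few lines of NumPy (a simplified stand-in for StableRep's actual training objective; the batch, temperature, and embeddings below are invented): every other image generated from the same caption is a positive, and the softmax over pairwise similarities is matched to a distribution spread uniformly over those positives.

```python
import numpy as np

def multi_positive_contrastive_loss(embeddings, caption_ids, temperature=0.1):
    """Each anchor's positives are every *other* image generated from the
    same text prompt; the loss matches the softmax over similarities to a
    target distribution spread uniformly over those positives."""
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    np.fill_diagonal(sim, -np.inf)              # exclude self-similarity
    log_p = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))  # log-softmax

    ids = np.asarray(caption_ids)
    pos = ids[:, None] == ids[None, :]
    np.fill_diagonal(pos, False)
    target = pos / pos.sum(axis=1, keepdims=True)  # uniform over positives
    return -(target * np.where(pos, log_p, 0.0)).sum(axis=1).mean()

rng = np.random.default_rng(0)
# Toy batch: 6 embeddings from 2 captions (3 synthetic images per caption).
caption_ids = [0, 0, 0, 1, 1, 1]
base = rng.standard_normal((2, 8))
embeddings = base[caption_ids] + 0.1 * rng.standard_normal((6, 8))

# Clustered same-caption embeddings give a lower loss than random ones.
loss_good = multi_positive_contrastive_loss(embeddings, caption_ids)
loss_bad = multi_positive_contrastive_loss(rng.standard_normal((6, 8)), caption_ids)
print(loss_good, loss_bad)
```

The uniform target over positives is what distinguishes this from standard single-positive contrastive losses such as the one used in SimCLR.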

“While StableRep helps mitigate the challenges of data acquisition in machine learning, it also ushers in a stride towards a new era of AI training techniques. The capacity to produce high-caliber, diverse synthetic images on command could help curtail cumbersome expenses and resources,” says Fan. 

The process of data collection has never been straightforward. Back in the 1990s, researchers had to manually capture photographs to assemble datasets for objects and faces. The 2000s saw individuals scouring the internet for data. However, this raw, uncurated data often contained discrepancies when compared to real-world scenarios and reflected societal biases, presenting a distorted view of reality. The task of cleansing datasets through human intervention is not only expensive, but also exceedingly challenging. Imagine, though, if this arduous data collection could be distilled down to something as simple as issuing a command in natural language. 

A pivotal aspect of StableRep’s triumph is the adjustment of the “guidance scale” in the generative model, which ensures a delicate balance between the synthetic images’ diversity and fidelity. When finely tuned, synthetic images used in training these self-supervised models were found to be as effective as, if not more effective than, real images.

Taking it a step forward, language supervision was added to the mix, creating an enhanced variant: StableRep+. When trained with 20 million synthetic images, StableRep+ not only achieved superior accuracy but also displayed remarkable efficiency compared to CLIP models trained with a staggering 50 million real images.

Yet, the path ahead isn't without its potholes. The researchers candidly address several limitations, including the current slow pace of image generation, semantic mismatches between text prompts and the resultant images, potential amplification of biases, and complexities in image attribution, all of which are imperative to address for future advancements. Another issue is that StableRep requires first training the generative model on large-scale real data. The team acknowledges that starting with real data remains a necessity; however, when you have a good generative model, you can repurpose it for new tasks, like training recognition models and visual representations. 

While StableRep offers a good solution by diminishing the dependency on vast real-image collections, it brings to the fore concerns regarding hidden biases within the uncurated data used for these text-to-image models. The choice of text prompts, integral to the image synthesis process, is not entirely free from bias, “indicating the essential role of meticulous text selection or possible human curation,” says Fan. 

“Using the latest text-to-image models, we've gained unprecedented control over image generation, allowing for a diverse range of visuals from a single text input. This surpasses real-world image collection in efficiency and versatility. It proves especially useful in specialized tasks, like balancing image variety in long-tail recognition, presenting a practical supplement to using real images for training,” says Fan. “Our work signifies a step forward in visual learning, towards the goal of offering cost-effective training alternatives while highlighting the need for ongoing improvements in data quality and synthesis.”

“One dream of generative model learning has long been to be able to generate data useful for discriminative model training,” says Google DeepMind researcher and University of Toronto professor of computer science David Fleet, who was not involved in the paper. “While we have seen some signs of life, the dream has been elusive, especially on large-scale complex domains like high-resolution images. This paper provides compelling evidence, for the first time to my knowledge, that the dream is becoming a reality. They show that contrastive learning from massive amounts of synthetic image data can produce representations that outperform those learned from real data at scale, with the potential to improve myriad downstream vision tasks.”

Fan is joined by Yonglong Tian PhD ’22 as lead authors of the paper, as well as MIT associate professor of electrical engineering and computer science and CSAIL principal investigator Phillip Isola; Google researcher and OpenAI technical staff member Huiwen Chang; and Google staff research scientist Dilip Krishnan. The team will present StableRep at the 2023 Conference on Neural Information Processing Systems (NeurIPS) in New Orleans.

Source: Synthetic imagery sets new bar in AI training efficiency

How do reasonable people disagree?

- Posted in Uncategorized by

U.S. politics is heavily polarized. This is often regarded as a product of irrationality: People can be tribal, are influenced by their peers, and often get information from very different, sometimes inaccurate sources.

Tribalism and misinformation are real enough. But what if people are often acting rationally as well, even in the process of arriving at very different views? What if they are not being misled or too emotional, but are thinking logically?

“There can be quite reasonable ways people can be predictably polarized,” says MIT philosopher Kevin Dorst, author of a new paper on the subject, based partly on his own empirical research.

This may especially be the case when people must weigh political and civic issues amid considerable ambiguity. Those ambiguities can generate political asymmetry: people consider evidence in predictably different ways, leading them to different conclusions. That does not mean they are not thinking logically, though.

“What’s going on is people are selectively scrutinizing information,” Dorst says. “That’s effectively why they move in opposite directions, because they scrutinize and selectively look for flaws in different places, and so they get overall different takes.”

The concept of rational polarization may help us develop a more coherent account about how views differ, by helping us avoid thinking that we alone are rational — or, conversely, that we have done no real thinking while arriving at our own opinions. Thus it can add nuance to our assessments of others.

The paper, “Rational Polarization,” appears in The Philosophical Review. Dorst, the sole author, is an assistant professor in MIT’s Department of Linguistics and Philosophy.

Looking for flaws

To Dorst, rational polarization stands as a useful alternative to other models about belief formation. In particular, rational polarization in his view improves upon one type of model of “Bayesian” thinking, in which people keep using new information to hone their views.

In Bayesian terms, because people use new information to update their views, they will rationally either change their ideas or not, as warranted. But in reality, Dorst asserts, things are not so simple. Often when we assess new evidence, there is ambiguity present, and Dorst contends that it is rational to be unsure about that ambiguity. This can generate polarization, because people’s prior assumptions influence where they find ambiguity.

Suppose a group of people have been given two studies about the death penalty: One study finds the death penalty has no deterrent effect on people’s behavior, and the other study finds it does. Even reading the same evidence, people in the group will likely wind up with different interpretations of it.

“Those who really believe in the deterrent effect will look closely at the study suggesting there is no deterrent effect, be skeptical about it, poke holes in the argument, and claim to recognize flaws in its reasoning,” Dorst says. “Conversely, for the people who disbelieve the deterrent effect, it’s the exact opposite. They find flaws in the study suggesting there is a deterrent effect.”

Even these seemingly selective readings can be rational, Dorst says: “It makes sense to scrutinize surprising information more than unsurprising information.” Therefore, he adds, “You can see that people who have this tendency to selectively scrutinize [can] drift apart even when they are presented with the same evidence that’s mixed in the same way.”
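
This drift can be illustrated with a toy simulation (a sketch of the intuition, not Dorst’s formal model): two agents with opposite priors see the same perfectly balanced evidence stream, but each discounts evidence that surprises it, i.e., contradicts its current belief.

```python
def update(belief, evidence, discount=0.5):
    """Shift belief toward the evidence, but scrutinize (discount) evidence
    that contradicts the current belief, modeling selective scrutiny."""
    surprising = (evidence > 0) != (belief > 0)
    weight = discount if surprising else 1.0
    return belief + weight * evidence

# Two agents with opposite priors process the same mixed evidence.
evidence_stream = [+1, -1] * 10        # perfectly balanced: ten of each
pro, con = 0.5, -0.5
for e in evidence_stream:
    pro, con = update(pro, e), update(con, e)
print(pro, con)                        # the agents end up far apart
```

Despite identical, balanced input, the agent who starts slightly “pro” ends strongly pro and the “con” agent strongly con, because each one’s scrutiny falls in different places.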

By the letter

To help show that this habit exists, Dorst also ran an online experiment about ambiguity, with 250 participants on the Prolific online survey platform. The aim was to see how much people’s views might become polarized in the presence of ambiguous information.

The participants were given an incomplete string of letters, as one might find in a crossword puzzle or on “Wheel of Fortune.” Some letter strings were parts of real words, and some were not. Depending on what additional information participants were given, the ambiguous, unsolvable strings of letters had a sharply polarizing effect on how people reacted to that information.

This process at work in the experiment, Dorst says, is similar to what happens when people receive uncertain information, in the news or in studies, about political matters.

“When you find a flaw, it gives you clear evidence that undermines the study,” Dorst says. Otherwise, people often tend to be uncertain about the material they see. “When you don’t find a flaw, it [can] give you ambiguous evidence and you don’t know what to make of it. As a result, that can lead to predictable polarization.”

The larger point, Dorst believes, is that we can arrive at a more nuanced and consistent picture of how political differences exist when people process similar information.

“There’s a perception that in politics, rational brains shut off and people think with their guts,” Dorst says. “If you take that seriously, you should say, ‘I form my beliefs on politics in the same ways.’”

Unless, that is, you believe you alone are rational, and everyone else is not — though Dorst finds this to be an untenable view of the world.

“Part of what I’m trying to do is give an account that’s not subject to that sort of instability,” Dorst says. “You don’t necessarily have to point the finger at others. It’s a much more interesting process if you think there’s something [rational] there as well.”

Source: How do reasonable people disagree?

Ingestible electronic device detects breathing depression in patients

- Posted in Uncategorized by

Diagnosing sleep disorders such as sleep apnea usually requires a patient to spend the night in a sleep lab, hooked up to a variety of sensors and monitors. Researchers from MIT, Celero Systems, and West Virginia University hope to make that process less intrusive, using an ingestible capsule they developed that can monitor vital signs from within the patient’s GI tract.

The capsule, which is about the size of a multivitamin, uses an accelerometer to measure the patient’s breathing rate and heart rate. In addition to diagnosing sleep apnea, the device could also be useful for detecting opioid overdoses in people at high risk, the researchers say.

“It’s an exciting intervention to help people be diagnosed and then receive the appropriate treatment if they suffer from obstructive sleep apnea,” says Giovanni Traverso, an associate professor of mechanical engineering at MIT and a gastroenterologist at Brigham and Women’s Hospital. “The device also has the potential for early detection of changes in respiratory status, whether it’s a result of opiates or other conditions that could be monitored, like asthma or chronic obstructive pulmonary disease (COPD).”

In a study of 10 human volunteers, the researchers showed that the capsule can be used to monitor vital signs and to detect sleep apnea episodes, which occur when the patient repeatedly stops and starts breathing during sleep. The patients did not show any adverse effects from the capsule, which passed harmlessly through the digestive tract.

Traverso is one of the senior authors of the study, along with Robert Langer, an MIT Institute Professor and member of MIT’s Koch Institute for Integrative Cancer Research; Victor Finomore, director of the Human Performance and Applied Neuroscience Research Center at the West Virginia University School of Medicine; and Ali Rezai, director of the Rockefeller Neuroscience Institute at the West Virginia University School of Medicine. The paper appears today in the journal Device.

Vital sign measurements

Over the past decade, Traverso and Langer have developed a range of ingestible sensors that could be used to monitor vital signs and diagnose disorders of the GI tract, such as gastrointestinal slowdown and inflammatory bowel diseases.

This new study focused on measuring vital signs, using a capsule developed by Celero Systems that includes an accelerometer that detects slight movements generated by the beating of the heart and the expansion of the lungs. The capsule also contains two small batteries and a wireless antenna that transmits data to an external device such as a laptop.

In tests in an animal model, the researchers found that this capsule could accurately measure breathing rate and heart rate. In one experiment, they showed that the sensor could detect the depression of breathing rate that resulted from a large dose of fentanyl, an opioid drug.
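
The underlying signal-processing idea is that breathing and heartbeat occupy separate frequency bands (roughly 0.1-0.7 Hz versus 0.7-3 Hz), so both rates can be read off the spectrum of a single motion signal. A simplified illustration on synthetic data (not the device’s actual algorithm):

```python
import math

def dominant_freq(signal, fs, lo, hi):
    """Return the frequency in [lo, hi] Hz with the most DFT power."""
    n = len(signal)
    step = fs / n                      # frequency resolution of the record
    best_f, best_p = lo, -1.0
    f = lo
    while f <= hi:
        re = sum(x * math.cos(2 * math.pi * f * i / fs) for i, x in enumerate(signal))
        im = sum(x * math.sin(2 * math.pi * f * i / fs) for i, x in enumerate(signal))
        p = re * re + im * im
        if p > best_p:
            best_f, best_p = p and f or f, p
        f += step
    return best_f

# Synthetic chest-motion signal: 0.25 Hz breathing plus a weaker 1.2 Hz heartbeat.
fs = 25.0                                       # samples per second
t = [i / fs for i in range(int(fs * 60))]       # one minute of data
sig = [math.sin(2 * math.pi * 0.25 * x) + 0.3 * math.sin(2 * math.pi * 1.2 * x)
       for x in t]

breaths = dominant_freq(sig, fs, 0.1, 0.7) * 60  # breaths per minute
beats = dominant_freq(sig, fs, 0.7, 3.0) * 60    # beats per minute
print(round(breaths), round(beats))              # 15 breaths/min, 72 beats/min
```

The real capsule must of course also contend with motion artifacts and much noisier signals; this only shows why one accelerometer trace suffices for both vital signs.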

Building on those results, the researchers decided to further test the capsule in a clinical trial at the West Virginia University Rockefeller Neuroscience Institute. Ten patients who enrolled in the study were monitored using the ingestible capsule, and these patients were also connected to the sensors typically used to monitor sleep, so the researchers could compare measurements from both types of sensors.

The researchers found that their ingestible sensor was able to accurately measure both breathing rate and heart rate, and it also detected a sleep apnea episode that one of the patients experienced.

“What we were able to show is that using the capsule, we could capture data that matched what the traditional transdermal sensors would capture,” Traverso says. “We also observed that the capsule could detect apnea, and that was confirmed with standard monitoring systems that are available in the sleep lab.”

In this study, the researchers monitored signals emitted by the capsule while it was in the stomach, but in a previous study, they showed that vital signs can also be measured from other parts of the GI tract.

“The stomach generally offers some of the best signals, mainly because it’s close to the heart and the lungs, but we know that we can also sense them elsewhere,” Traverso says.

None of the patients reported any discomfort or harm from the capsule. Radiographic imaging performed 14 days after the capsules were ingested revealed that all of them had passed through the patients’ bodies. The research team’s previous work has shown that objects of similar size usually move through the digestive tract in a little more than a day.

Close monitoring

The researchers envision that this kind of sensor could be used to diagnose sleep apnea in a less intrusive way than the skin-based sensors that are now used. It could also be used to monitor patients when they begin treatment for apnea, to make sure that the treatments are effective.

Celero Systems, a company founded by Traverso, Langer, Jeremy Ruskin, a professor of medicine at Harvard Medical School, and Benjamin Pless, now CEO of the company, is now working on sensors that could be used to detect sleep apnea or opioid overdose.

“We know that people who have had an overdose are at higher risk of recurrence, so those individuals could be monitored more closely so that in the event of another overdose, someone could help them,” Traverso says.

In future work, the researchers hope to incorporate an overdose reversal agent such as nalmefene into the device, so that drug release would be triggered when the person’s breathing rate slowed or stopped. They are also working on strategies to lengthen the amount of time that the capsules could remain in the stomach.

The research was funded by the Karl van Tassel Career Professorship, MIT’s Department of Mechanical Engineering, and Celero Systems.

Authors of the paper also include Pless, James Mahoney, Justin Kupec, Robert Stansbury, Daniel Bacher, Shannon Schuetz, and Alison Hayward.

Source: Ingestible electronic device detects breathing depression in patients

How cell identity is preserved when cells divide

- Posted in Uncategorized by

Every cell in the human body contains the same genetic instructions, encoded in its DNA. However, out of about 30,000 genes, each cell expresses only those genes that it needs to become a nerve cell, immune cell, or any of the other hundreds of cell types in the body.

Each cell’s fate is largely determined by chemical modifications to the proteins that decorate its DNA; these modifications in turn control which genes get turned on or off. When cells copy their DNA to divide, however, they lose half of these modifications, raising the question: How do cells maintain the memory of what kind of cell they are supposed to be?

A new MIT study proposes a theoretical model that helps explain how these memories are passed from generation to generation when cells divide. The research team suggests that within each cell’s nucleus, the 3D folding of its genome determines which parts of the genome will be marked by these chemical modifications. After a cell copies its DNA, the marks are partially lost, but the 3D folding allows the cell to easily restore the chemical marks needed to maintain its identity. Each time a cell divides, the chemical marks in turn allow the cell to restore the 3D folding of its genome. By juggling the memory between the 3D folding and the marks in this way, it can be preserved over hundreds of cell divisions.

“A key aspect of how cell types differ is that different genes are turned on or off. It's very difficult to transform one cell type to another because these states are very committed,” says Jeremy Owen PhD ’22, the lead author of the study. “What we have done in this work is develop a simple model that highlights qualitative features of the chemical systems inside cells and how they need to work in order to make memories of gene expression stable.”

Leonid Mirny, a professor in MIT’s Institute for Medical Engineering and Science and the Department of Physics, is the senior author of the paper, which appears today in Science. Dino Osmanović, a former postdoctoral fellow at MIT’s Center for the Physics of Living Systems, is also an author of the study.

Maintaining memory

Within the cell nucleus, DNA is wrapped around proteins called histones, forming a densely packed structure known as chromatin. Histones can display a variety of modifications that help control which genes are expressed in a given cell. These modifications generate “epigenetic memory,” which helps a cell to maintain its cell type. However, how this memory is passed on to daughter cells is somewhat of a mystery.

Previous work by Mirny’s lab has shown that the 3D structure of chromosomes is, to a great extent, determined by these epigenetic modifications, or marks. In particular, they found that certain chromatin regions, with marks telling cells not to read a particular segment of DNA, attract each other and form dense clumps called heterochromatin, which are difficult for the cell to access.

In their new study, Mirny and his colleagues wanted to answer the question of how those epigenetic marks are maintained from generation to generation. They developed a computational model of a polymer with a few marked regions, and saw that these marked regions collapse into each other, forming a dense clump. Then they studied how these marks are lost and gained.

When a cell copies its DNA to divide it between two daughter cells, each copy gets about half of the epigenetic marks. The cell then needs to restore the lost marks before the DNA is passed to the daughter cells, and the way chromosomes were folded serves as a blueprint for where these remaining marks should go.

These modifications are added by specialized enzymes known as “reader-writer” enzymes. Each of these enzymes is specific for a certain mark, and once they “read” existing marks, they “write” additional marks at nearby locations. If the chromatin is already folded into a 3D shape, marks will accumulate in regions that already had modifications inherited from the parent cell.
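
The scheme above can be caricatured in a few lines: a marked block folds into a clump with all-to-all 3D contacts, marks are halved at "division" (deterministically here, for reproducibility), and a reader-writer rule restores a mark wherever enough 3D contacts are already marked. This is a toy sketch, not the paper’s actual polymer model:

```python
def divide(marks):
    """DNA replication: each daughter copy keeps roughly half the marks
    (here, deterministically every other one)."""
    return {site for j, site in enumerate(sorted(marks)) if j % 2 == 0}

def restore(marks, contacts, threshold=2, rounds=5):
    """Reader-writer enzymes: a site gains a mark if enough of its
    3D contacts already carry one."""
    marks = set(marks)
    for _ in range(rounds):
        marks |= {i for i in contacts if len(contacts[i] & marks) >= threshold}
    return marks

N = 20
block = set(range(5, 15))                    # the folded heterochromatin clump
# Contact map: linear neighbors along the polymer, plus all-to-all
# contacts inside the folded block.
contacts = {i: {j for j in (i - 1, i + 1) if 0 <= j < N} for i in range(N)}
for i in block:
    contacts[i] |= block - {i}

marks = restore(divide(set(block)), contacts)
print(sorted(marks))                         # the original block is fully restored
```

The surviving half-density marks are enough to re-mark the whole clump, while sites outside it (with only linear neighbors) never accumulate enough marked contacts, so the pattern neither erodes nor spreads.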

“There are several lines of evidence that suggest that the spreading can happen in 3D, meaning if there are two parts that are near each other in space, even if they're not adjacent along the DNA, then spreading can happen from one to another,” Owen says. “That is how the 3D structure can influence the spreading of these marks.”

This process is analogous to the spread of infectious disease, as the more contacts that a chromatin region has with other regions, the more likely it is to be modified, just as an individual is more likely to become infected as their number of contacts increases. In this analogy, dense regions of marked chromatin are like cities where people have many social interactions, while the rest of the genome is comparable to sparsely populated rural areas.

“That essentially means that the marks will be spreading in the dense region and will be very sparse anywhere outside it,” Mirny says.

The new model also suggests possible parallels between epigenetic memories stored in a folded polymer and memories stored in a neural network, he adds. Folding of marked regions can be thought of as analogous to the strong connections formed between neurons that fire together in a neural network.

“Broadly this suggests that akin to the way neural networks are able to do very complex information processing, the epigenetic memory mechanism we described may be able to process information, not only store it,” he says.

“One beautiful aspect of the work is how it offers and explores connections with ideas from the seemingly very distant corners of science, including spreading of infections (to describe formation of new chemical marks in the 3D vicinity of the existing one), associative memory in model neural networks, and protein folding,” says Alexander Grosberg, a professor of physics at New York University, who was not involved in the research.

Epigenetic erosion

While this model appeared to offer a good explanation for how epigenetic memory can be maintained, the researchers found that eventually, reader-writer enzyme activity would lead to the entire genome being covered in epigenetic modifications. When they altered the model to make the enzyme weaker, it didn’t cover enough of the genome and memories were lost in a few cell generations.

To get the model to more accurately account for the preservation of epigenetic marks, the researchers added another element: limiting the amount of reader-writer enzyme available. They found that if the amount of enzyme was kept between 0.1 and 1 percent of the number of histones (a percentage based on estimates of the actual abundance of these enzymes), their model cells could accurately maintain their epigenetic memory for up to hundreds of generations, depending on the complexity of the epigenetic pattern.

It is already known that cells begin to lose their epigenetic memory as they age, and the researchers now plan to study whether the process they described in this paper might play a role in epigenetic erosion and loss of cell identity. They also plan to model a disease called progeria, in which cells have a genetic mutation that leads to loss of heterochromatin. People with this disease experience accelerated aging.

“The mechanistic link between these mutations and the epigenetic changes that eventually happen is not well understood,” Owen says. “It would be great to use a model like ours where there are dynamic marks, together with polymer dynamics, to try and explain that.”

The researchers also hope to work with collaborators to experimentally test some of the predictions of their model, which could be done, for example, by altering the level of reader-writer enzymes in living cells and measuring the effect on epigenetic memory.

The research was funded by the National Human Genome Research Institute, the National Institute of General Medical Sciences, and the National Science Foundation.

Source: How cell identity is preserved when cells divide

A new ultrasound patch can measure how full your bladder is

- Posted in Uncategorized by

MIT researchers have designed a wearable ultrasound monitor, in the form of a patch, that can image organs within the body without the need for an ultrasound operator or application of gel.

In a new study, the researchers showed that their patch can accurately image the bladder and determine how full it is. This could help patients with bladder or kidney disorders more easily track whether these organs are functioning properly, the researchers say.

This approach could also be adapted to monitor other organs within the body by changing the location of the ultrasound array and tuning the frequency of the signal. Such devices could potentially enable earlier detection of cancers that form deep within the body, such as ovarian cancer.

“This technology is versatile and can be used not only on the bladder but any deep tissue of the body. It’s a novel platform that can do identification and characterization of many of the diseases that we carry in our body,” says Canan Dagdeviren, an associate professor in MIT’s Media Lab and the senior author of the study.

Lin Zhang, an MIT research scientist; Colin Marcus, an MIT graduate student in electrical engineering and computer science; and Dabin Lin, a professor at Xi’an Technological University, are the lead authors of a paper describing the work, which appears today in Nature Electronics.

Wearable monitoring

Dagdeviren’s lab, which specializes in designing flexible, wearable electronic devices, recently developed an ultrasound monitor that can be incorporated into a bra and used to screen for breast cancer. In the new study, the team used a similar approach to develop a wearable patch that can adhere to the skin and take ultrasound images of organs located within the body.

For their first demonstration, the researchers decided to focus on the bladder, partly inspired by Dagdeviren’s younger brother, who was diagnosed with kidney cancer a few years ago. After having one of his kidneys surgically removed, he had difficulty fully emptying his bladder. Dagdeviren wondered if an ultrasound monitor that reveals how full the bladder is might help patients similar to her brother, or people with other types of bladder or kidney problems.

“Millions of people are suffering from bladder dysfunction and related diseases, and not surprisingly, bladder volume monitoring is an effective way to assess your kidney health and wellness,” she says.

Currently, the only way to measure bladder volume is using a traditional, bulky ultrasound probe, which requires going to a medical facility. Dagdeviren and her colleagues wanted to develop a wearable alternative that patients could use at home.

To achieve that, they created a flexible patch made of silicone rubber, embedded with five ultrasound arrays made from a new piezoelectric material that the researchers developed for this device. The arrays are positioned in the shape of a cross, allowing the patch to image the entire bladder, which is about 12 by 8 centimeters when full.

The polymer that makes up the patch is naturally sticky and adheres gently to the skin, making it easy to attach and detach. Once placed on the skin, underwear or leggings can help to hold it in place.

Bladder volume

In a study performed with collaborators at the Center for Ultrasound Research and Translation and Department of Radiology at Massachusetts General Hospital, the researchers showed that the new patch could capture images comparable to those taken with a traditional ultrasound probe, and these images could be used to track changes in bladder volume.

For the study, the researchers recruited 20 patients with a range of body mass indexes. Subjects were first imaged with a full bladder, then with a partially empty bladder, and then with a completely empty bladder. The images obtained from the new patch were similar in quality to those taken with traditional ultrasound, and the ultrasound arrays worked on all subjects regardless of their body mass index.

With this patch, no ultrasound gel is needed and no pressure needs to be applied, as it is with a regular ultrasound probe, because the field of view is large enough to encompass the entire bladder.
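
Once the bladder’s dimensions are visible in such an image, volume is conventionally estimated with the prolate-ellipsoid approximation used in clinical ultrasound, V ≈ 0.52 × width × depth × height (0.52 ≈ π/6). The figures below are illustrative, not measurements from the study:

```python
def bladder_volume_ml(width_cm, depth_cm, height_cm, k=0.52):
    """Ellipsoid approximation widely used for ultrasound bladder volume:
    V = k * W * D * H, with the clinical correction factor k ~ pi/6."""
    return k * width_cm * depth_cm * height_cm

# A full bladder of roughly 12 x 8 x 6 cm:
print(round(bladder_volume_ml(8, 6, 12)))  # about 300 ml
```

Tracking this number as it falls between a full and an emptied bladder is exactly the kind of volume change the study measured against traditional ultrasound.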

To see the images, the researchers connected their ultrasound arrays to the same kind of ultrasound machine used in medical imaging centers. However, the MIT team is now working on a portable device, about the size of a smartphone, that could be used to view the images.

“In this work, we have further developed a path toward clinical translation of conformable ultrasonic biosensors that yield valuable information about vital physiologic parameters. Our group hopes to build on this and develop a suite of devices that will ultimately bridge the information gap between clinicians and patients,” says Anthony E. Samir, director of the MGH Center for Ultrasound Research and Translation and Associate Chair of Imaging Sciences at MGH Radiology, who is also an author of the study.

The MIT team also hopes to develop ultrasound devices that could be used to image other organs within the body, such as the pancreas, liver, or ovaries. Based on the location and depth of each organ, the researchers need to alter the frequency of the ultrasound signal, which requires designing new piezoelectric materials. For some of these organs, located deep within the body, the device may work better as an implant rather than a wearable patch.

“For whatever organ that we need to visualize, we go back to the first step, select the right materials, come up with the right device design and then fabricate everything accordingly,” before testing the device and performing clinical trials, Dagdeviren says.

“This work could develop into a central area of focus in ultrasound research, motivate a new approach to future medical device designs, and lay the groundwork for many more fruitful collaborations between materials scientists, electrical engineers, and biomedical researchers,” says Anantha Chandrakasan, dean of MIT’s School of Engineering, the Vannevar Bush Professor of Electrical Engineering and Computer Science, and an author of the paper.

The research was funded by a National Science Foundation CAREER award, a 3M Non-Tenured Faculty Award, the Sagol Weizmann-MIT Bridge Program, Texas Instruments Inc., the MIT Media Lab Consortium, a National Science Foundation Graduate Research Fellowship, and an ARRS Scholar Award.

Source: A new ultrasound patch can measure how full your bladder is

Technique enables AI on edge devices to keep learning over time

- Posted in Uncategorized by

Personalized deep-learning models can enable artificial intelligence chatbots that adapt to understand a user’s accent or smart keyboards that continuously update to better predict the next word based on someone’s typing history. This customization requires constant fine-tuning of a machine-learning model with new data.

Because smartphones and other edge devices lack the memory and computational power necessary for this fine-tuning process, user data are typically uploaded to cloud servers where the model is updated. But data transmission uses a great deal of energy, and sending sensitive user data to a cloud server poses a security risk.  

Researchers from MIT, the MIT-IBM Watson AI Lab, and elsewhere developed a technique that enables deep-learning models to efficiently adapt to new sensor data directly on an edge device.

Their on-device training method, called PockEngine, determines which parts of a huge machine-learning model need to be updated to improve accuracy, and only stores and computes with those specific pieces. It performs the bulk of these computations while the model is being prepared, before runtime, which minimizes computational overhead and boosts the speed of the fine-tuning process.    

When compared to other methods, PockEngine significantly sped up on-device training, performing up to 15 times faster on some hardware platforms. Moreover, PockEngine didn’t cause models to have any dip in accuracy. The researchers also found that their fine-tuning method enabled a popular AI chatbot to answer complex questions more accurately.

“On-device fine-tuning can enable better privacy, lower costs, customization ability, and also lifelong learning, but it is not easy. Everything has to happen with a limited number of resources. We want to be able to run not only inference but also training on an edge device. With PockEngine, now we can,” says Song Han, an associate professor in the Department of Electrical Engineering and Computer Science (EECS), a member of the MIT-IBM Watson AI Lab, a distinguished scientist at NVIDIA, and senior author of an open-access paper describing PockEngine.

Han is joined on the paper by lead author Ligeng Zhu, an EECS graduate student, as well as others at MIT, the MIT-IBM Watson AI Lab, and the University of California San Diego. The paper was recently presented at the IEEE/ACM International Symposium on Microarchitecture.

Layer by layer

Deep-learning models are based on neural networks, which comprise many interconnected layers of nodes, or “neurons,” that process data to make a prediction. When the model is run, a process called inference, a data input (such as an image) is passed from layer to layer until the prediction (perhaps the image label) is output at the end. During inference, each layer no longer needs to be stored after it processes the input.

But during training and fine-tuning, the model undergoes a process known as backpropagation. In backpropagation, the output is compared to the correct answer, and then the model is run in reverse. Each layer is updated as the model’s output gets closer to the correct answer.

Because each layer may need to be updated, the entire model and its intermediate results must be stored, making fine-tuning more memory-demanding than inference.
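The memory contrast with inference can be seen in a toy sketch (again using functions as stand-in layers, not real network code): to backpropagate, the forward pass must retain every intermediate activation, because each layer's gradient depends on its input.

```python
# Sketch of why training is more memory-hungry than inference:
# the forward pass caches every intermediate activation so that
# backpropagation can later revisit each layer.

def forward_with_cache(layers, x):
    """Run the forward pass, retaining all intermediates for backprop."""
    activations = [x]
    for layer in layers:
        activations.append(layer(activations[-1]))
    return activations  # input plus one activation per layer

layers = [lambda a: a * 2.0, lambda a: a + 1.0]
cache = forward_with_cache(layers, 3.0)
print(len(cache))  # grows with the number of layers
```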

However, not all layers in the neural network are important for improving accuracy. And even for layers that are important, the entire layer may not need to be updated. Those layers, and pieces of layers, don’t need to be stored. Furthermore, one may not need to go all the way back to the first layer to improve accuracy — the process could be stopped somewhere in the middle.
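The idea that updates need not touch every layer can be sketched as a toy update rule in Python; the "gradient step" below is a simplified stand-in for a real optimizer, not PockEngine's implementation:

```python
# Sketch of partial backpropagation: only layers marked trainable are
# updated, while frozen layers keep their weights (and their gradients
# never need to be computed or stored).

def sparse_update(weights, grads, trainable, lr=0.5):
    """Apply a toy gradient step only to the trainable layers."""
    return [
        w - lr * g if is_trainable else w
        for w, g, is_trainable in zip(weights, grads, trainable)
    ]

weights = [1.0, 1.0, 1.0, 1.0]
grads = [0.5, 0.5, 0.5, 0.5]
trainable = [False, False, True, True]  # backprop stops before layers 0-1

print(sparse_update(weights, grads, trainable))
```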

PockEngine takes advantage of these factors to speed up the fine-tuning process and cut down on the amount of computation and memory required.

The system first fine-tunes each layer, one at a time, on a certain task and measures the accuracy improvement after each individual layer. In this way, PockEngine identifies the contribution of each layer, as well as trade-offs between accuracy and fine-tuning cost, and automatically determines the percentage of each layer that needs to be fine-tuned.
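A simplified version of that per-layer search can be sketched as follows; the gain numbers are hypothetical, and the selection rule is a toy stand-in for PockEngine's accuracy/cost trade-off analysis:

```python
# Sketch of layer selection: given a measured accuracy gain per layer
# (from fine-tuning each layer individually), keep only the layers
# whose gain clears a threshold.

def select_layers(layer_gains, threshold):
    """Return indices of layers worth fine-tuning."""
    return [i for i, gain in enumerate(layer_gains) if gain >= threshold]

# Hypothetical measured gains (in accuracy points) for a 5-layer model.
gains = [0.1, 0.0, 1.2, 2.5, 3.0]
print(select_layers(gains, threshold=1.0))  # layers worth updating
```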

“This method matches the accuracy very well compared to full backpropagation on different tasks and different neural networks,” Han adds.

A pared-down model

Conventionally, the backpropagation graph is generated during runtime, which involves a great deal of computation. Instead, PockEngine does this during compile time, while the model is being prepared for deployment.

PockEngine deletes bits of code to remove unnecessary layers or pieces of layers, creating a pared-down graph of the model to be used during runtime. It then performs other optimizations on this graph to further improve efficiency.

Since all this preparation happens only once, it reduces computational overhead at runtime.

“It is like before setting out on a hiking trip. At home, you would do careful planning — which trails are you going to go on, which trails are you going to ignore. So then at execution time, when you are actually hiking, you already have a very careful plan to follow,” Han explains.
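That compile-time planning can be sketched in miniature; the names below are illustrative, and the "plan" is a toy stand-in for a pared-down backpropagation graph:

```python
# Sketch of compile-time planning: the list of layers to update is
# built once, before the training loop, so each runtime step walks
# only the precomputed plan rather than the full model.

def compile_update_plan(num_layers, trainable_layers):
    """Build the runtime update plan once, at 'compile time'."""
    return [i for i in range(num_layers) if i in trainable_layers]

def training_step(weights, grads, plan, lr=0.5):
    """A toy fine-tuning step that touches only the planned layers."""
    for i in plan:
        weights[i] -= lr * grads[i]
    return weights

plan = compile_update_plan(6, trainable_layers={3, 4, 5})
print(training_step([1.0] * 6, [1.0] * 6, plan))
```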

When they applied PockEngine to deep-learning models on different edge devices, including Apple M1 chips and the digital signal processors common in many smartphones and Raspberry Pi computers, it performed on-device training up to 15 times faster, without any drop in accuracy. PockEngine also significantly slashed the amount of memory required for fine-tuning.

The team also applied the technique to the large language model Llama-V2. With large language models, the fine-tuning process involves providing many examples, and it’s crucial for the model to learn how to interact with users, Han says. The process is also important for models tasked with solving complex problems or reasoning about solutions.

For instance, Llama-V2 models that were fine-tuned using PockEngine answered the question “What was Michael Jackson’s last album?” correctly, while models that weren’t fine-tuned failed. PockEngine cut the time it took for each iteration of the fine-tuning process from about seven seconds to less than one second on an NVIDIA Jetson Orin, an edge GPU platform.

In the future, the researchers want to use PockEngine to fine-tune even larger models designed to process text and images together.

“This work addresses growing efficiency challenges posed by the adoption of large AI models such as LLMs across diverse applications in many different industries. It not only holds promise for edge applications that incorporate larger models, but also for lowering the cost of maintaining and updating large AI models in the cloud,” says Ehry MacRostie, a senior manager in Amazon’s Artificial General Intelligence division who was not involved in this study but works with MIT on related AI research through the MIT-Amazon Science Hub.

This work was supported, in part, by the MIT-IBM Watson AI Lab, the MIT AI Hardware Program, the MIT-Amazon Science Hub, the National Science Foundation (NSF), and the Qualcomm Innovation Fellowship.

Source: Technique enables AI on edge devices to keep learning over time