test 1 Astronomy - May 2018 USA

In Galaxy Zoo, the public is asked to identify the type of galaxy shown: Is it a disk? Is it edge-on? Is there a central bulge? These features can be quickly identified by eye, but natural variations can make them exceedingly difficult for computers to recognize and categorize.

“Humans are actually very well designed to picking out serendipitous discoveries in image datasets,” Fortson says. “By virtue of evolution, humans have developed this amazing visual cortex that can differentiate the unknown unknowns from the knowns.”

Of course, using the untrained public doesn’t come without its challenges. People make mistakes. Luckily, the large number of people involved in the identification can be used to create averages and a group consensus, which, over the long run, can be even more accurate than a single scientist’s identification. In Galaxy Zoo, 40 different individuals examine each galaxy to create a trusted identification.

By carefully processing the results, individual people can even be weighted differently depending on their identification success rate. In this way, people whose identifications generally don’t agree with the group consensus can be flagged for rejection, so they don’t skew the end results.

Rise of the machines

Once the masses have identified and categorized thousands of images, significant work remains to analyze the data. This is where computers finally come in. These machines are the heavy lifters, allowing for complex calculations and comparisons that the human brain would be hard-pressed to match on its own.

While machines historically can only do exactly what they are told, a subset of computers are being taught to think on their own. Astronomers are using a type of artificial intelligence, called machine learning, to get computers to teach themselves how to find patterns in the data.

A specific method of machine learning known as artificial neural networks was designed based on how the brain functions.
These neural networks draw connections in vast webs of data, just as the human brain does. To create these networks, a scientist starts by showing the computer a “training set,” which is a series of examples containing what the computer is looking for, such as spiral galaxies. Over time, and with enough examples, the computer will become adept at identifying spiral galaxies, despite their wide range of appearances.

At this point, the scientist can provide the computer with a sample of unidentified galaxies, and the machine will return those that fit the criteria it has assessed.

Image caption: Galaxies, clusters of galaxies, and clusters of clusters of galaxies join with dark matter to form a grand, weblike structure called the cosmic web, a slice of which is shown here. With the help of artificial neural networks, astronomers hope to run simulations like this to investigate the cosmic web in much greater detail than previously possible. NASA, ESA, AND E. HALLMAN (UNIVERSITY OF COLORADO BOULDER)

Machines can also be taught a much more difficult task: assessing how objects and their characteristics relate to one another. For example, scientists have used artificial neural networks to investigate how galaxies form clusters and how that grouping affects the numbers of stars the galaxies produce. Only with the assistance of computers are the scientists able to compare the many physical properties at play, such as galaxy mass, distance between galaxies, and previous interactions between galaxies. And by comparing many hundreds of thousands of galaxies, scientists are able to make broad conclusions about our universe that are unbiased by small irregularities.

Pull quote: LSST will collect as much as 30 terabytes of data every clear night.

When encoded properly, artificial neural networks can provide profound insight to scientists; however, they can also be easily misused. For example, if the training set is not extensive enough, the computer will draw the wrong conclusions.
Or, as astronomers are fond of repeating, “Garbage in, garbage out.”

The other drawback to artificial neural networks is that they require vast datasets to “learn” from. Luckily, in the era of large-scale surveys, vast datasets are common. This means that artificial neural networks can quickly turn the problem of too much data into an advantage. The larger the training set (which citizen scientists can help bolster), the better the results.

The future of unexpected discoveries

“Our ability to collect these humongous datasets is developing in parallel with our ability to interpret these huge datasets,” says Ivezić. “Both directions are important: people who collect data and people who develop tools to analyze and interpret. Otherwise we’d just be stuck with a huge pile of zeros and ones we couldn’t make sense out of.”

With the combination of large-scale surveys, a legion of citizen scientists, and new machine learning techniques, it seems many new unexpected discoveries will soon emerge from the darkness. But as for the nature of those discoveries? Only time can tell.

Mara Johnson-Groh is a science writer and photographer who writes about everything under the Sun, and even things beyond it.
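
Sidebar: a sketch of crowd consensus. The article describes how Galaxy Zoo combines many volunteers' answers into a group consensus and gives less weight to people who rarely agree with it. The Python sketch below illustrates that general idea only; it is not Galaxy Zoo's actual pipeline. The function names, the three made-up volunteers, the toy labels, and the 0.5 cutoff are all assumptions chosen for illustration.

```python
from collections import Counter, defaultdict


def majority_label(votes):
    """Return the most common label among (volunteer, label) votes."""
    counts = Counter(label for _, label in votes)
    return counts.most_common(1)[0][0]


def volunteer_weights(classifications):
    """Weight each volunteer by how often they match the simple majority."""
    agree = defaultdict(int)
    total = defaultdict(int)
    for votes in classifications.values():
        consensus = majority_label(votes)
        for volunteer, label in votes:
            total[volunteer] += 1
            if label == consensus:
                agree[volunteer] += 1
    return {v: agree[v] / total[v] for v in total}


def weighted_consensus(classifications, weights, min_weight=0.5):
    """Re-tally each galaxy, ignoring volunteers below min_weight (assumed cutoff)."""
    result = {}
    for galaxy, votes in classifications.items():
        tally = defaultdict(float)
        for volunteer, label in votes:
            w = weights.get(volunteer, 0.0)
            if w >= min_weight:
                tally[label] += w
        result[galaxy] = max(tally, key=tally.get) if tally else None
    return result


# Toy data (hypothetical): three volunteers classify three galaxies.
classifications = {
    "g1": [("ann", "spiral"), ("bob", "spiral"), ("cat", "elliptical")],
    "g2": [("ann", "elliptical"), ("bob", "elliptical"), ("cat", "spiral")],
    "g3": [("ann", "spiral"), ("bob", "elliptical"), ("cat", "elliptical")],
}

weights = volunteer_weights(classifications)
consensus = weighted_consensus(classifications, weights)
print(weights)
print(consensus)
```

In this toy run, the volunteer who usually disagrees with the majority falls below the cutoff and no longer counts toward the final tally. A real project would likely use far more votes per galaxy (the article cites 40) and a more careful weighting scheme.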
