By Katie Elyce Jones, PillarQ

In 2021, archaeologist Markus Eberl of Vanderbilt University visited the Mediterranean port city of Caesarea, Israel, to collect ancient specks of sand and concrete mixes known as mortar. While they may sound unassuming, such mortar samples tell a rich history of the city’s builders and their construction practices.

“What makes [Caesarea] fascinating is we have multiple different cultures coming in,” said Eberl, associate professor of anthropology. “Over 1,300 years the city is being built and modified.”

Caesarea was founded by King Herod around 30 B.C.E. under Roman rule, when a harbor was constructed. In the first millennium C.E., the city became part of the Byzantine Empire; it was later conquered by Arab forces and then taken and retaken by Crusader and Muslim forces.

Along the city’s ancient walls and foundations, Eberl scooped teaspoon-sized samples of mortars dating back as far as the Roman Empire.

“I’m interested in identifying different materials,” he said. “Ancient peoples used mortars with crushed olive pits, different kinds of sand and volcanic ash, for example.”

Eberl worked with local archaeologists and conservators, who advised him on where he could find ancient samples (rather than restored materials) and how much he could take.

Even small samples of these ancient materials can yield millions of data points thanks to the third-millennium technology of artificial intelligence (AI).

In collaboration with Vanderbilt’s Data Science Institute, Eberl and colleagues are uncovering new information about Caesarea’s cultural history through its mortars. Ultimately, they hope to better understand how ancient builders created their mortar compositions and what they learned from predecessors across time and contemporaries across distances.

Back in Tennessee on Vanderbilt’s campus, the mortar samples are funneled into an industrial particle analyzer (a Partan 3D) that records high-speed, high-resolution video of the incoming particles. Then, 10 images of each particle are generated—amounting to hundreds of thousands of images for analysis.

The practice of examining physical samples for historical significance is nothing new. However, as imaging technology like Vanderbilt’s particle analyzer becomes more advanced, data capture increases and so does the burden of analysis.

“This is what archaeologists do: they look manually at these particles under a microscope,” Eberl said. “Trying to look at hundreds of thousands of particles under a microscope is tedious, time-consuming, and subjective. At least in my field, archaeologists have moved away from this type of analysis because people distrust the results.”

This is where data scientists are helping.

“The entire idea behind the Data Science Institute at Vanderbilt is to empower researchers using these new tools so they can do things that would previously not have been possible,” said Jesse Spencer-Smith, chief data scientist.

Spencer-Smith explained that, in addition to the images, the analyzer captures about 40 characteristics, such as size, roundness, and opacity. “This project is like a case study in data science,” he said. From about 50 samples, the team gathers about 16 million photos and over 500 million data points.

In fall 2022, the team, including students, developed a convolutional neural network (CNN) to analyze particle characteristics. The CNN is trained with supervised learning on structured data, in this case labeled particle samples.

“To train our models, I actually collected different material samples—volcanic ash, local sand, olive pit refuse, and so forth. So, we’re not just working with the actual mortar sample,” Eberl said.
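For readers curious about what supervised learning on labeled particle data looks like in practice, the idea can be sketched in a few lines of Python. This is a toy illustration only, not the team's model: a simple nearest-centroid classifier stands in for the CNN, the three features are borrowed from the characteristics the analyzer is said to record (size, roundness, opacity), and every number, as well as the names `LABELED_PARTICLES`, `train`, and `classify`, is invented for the example.

```python
from math import dist
from statistics import fmean

# Hypothetical, made-up feature vectors: (size, roundness, opacity).
# Each one carries the label of the reference material it came from,
# which is what makes the learning "supervised".
LABELED_PARTICLES = [
    ((0.20, 0.90, 0.80), "local sand"),
    ((0.25, 0.85, 0.75), "local sand"),
    ((0.05, 0.40, 0.30), "volcanic ash"),
    ((0.08, 0.35, 0.25), "volcanic ash"),
    ((0.60, 0.55, 0.95), "olive pit"),
    ((0.65, 0.50, 0.90), "olive pit"),
]


def train(labeled):
    """Average the feature vectors of each label into one centroid."""
    grouped = {}
    for features, label in labeled:
        grouped.setdefault(label, []).append(features)
    return {
        label: tuple(fmean(column) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def classify(centroids, features):
    """Return the label whose centroid is closest to the given features."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


centroids = train(LABELED_PARTICLES)
unknown_particle = (0.07, 0.38, 0.28)  # invented measurement, for illustration
print(classify(centroids, unknown_particle))
```

The same pattern (learn from labeled reference materials, then label unknown particles from a mortar sample) is what a CNN does at far larger scale and with far richer inputs.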

Postdoctoral Fellow Abbie Petulante said one of the big challenges was handling this massive amount of data, including making results reproducible.

“We wanted to make sure that anyone could go grab the exact same dataset and use it in the same way,” Petulante said. “We built a pipeline that analyzes the data from a particle analyzer to a computer to Hugging Face and processes that data.”

Now, anyone can pull the files from Hugging Face, an online platform for sharing open-source machine learning models and datasets.

The work also became an opportunity for students to learn and test the capabilities of AI models. Senior Data Scientist Charreau Bell came up with the idea of gamifying learning through a website where students can compete as data classifiers against the machine learning algorithm.

“We collaborate as research teams,” Spencer-Smith said. “Markus’ students participate in training with us, developing the models themselves, and then we take them a bit further.”

This spring, the team plans to expand image analysis from supervised learning with CNNs to transformer models that can learn in a semi-supervised or unsupervised way.

“It was clear that there were things available in the images that were not available in the structured data,” Spencer-Smith said. “We’re seeing how well we can use that type of technology versus CNNs versus more traditional machine learning.”

With new insights, they hope to further differentiate between particles. For instance, they may be able to pin down whether similar particles are from different quarries, which would give them information about the sourcing of materials.

“We hope to identify sources where people traveled further to get different quality ingredients,” Eberl said. The Romans, for example, often used volcanic ash that would have been transported to Caesarea from Italy.

Eberl plans to begin releasing results in academic papers later this year. The project is funded by the National Endowment for the Humanities (NEH). He said the NEH wants “to fund moonshots in the humanities where they tend to create novel approaches to problems that were previously not approachable or solvable.”

While many may still consider AI a technology of the future, through collaborations like this between subject matter experts and data scientists, it’s already helping uncover new information about the ancient past.