Aristotle API

CA AB 2013 Disclosure

Last Updated: December 31, 2025

Harmonic’s Aristotle models have been trained on a mixture of data from licensed datasets and publicly available sources, including data obtained from open-source collections and in-house data.

Such datasets were selected to further Aristotle’s intended purpose of providing advanced mathematical and reasoning capabilities, and were first used in or about August 2023.

The training datasets are large and include both labeled and unlabeled data covering diverse types of data points, including problem statements, thinking traces, and solutions to math problems.

The training data includes both data that is in the public domain and data that may be subject to intellectual property rights, which was used under license, with permission, or pursuant to fair use principles under applicable law. The training datasets do not include any personal information or aggregate consumer information.

The raw training data underwent certain standard preprocessing steps, cleaning, processing and other modifications, including autoformalization (automatically converting informal mathematical statements into formal logic representations), to make the data suitable for Aristotle’s mathematical problem solving. The development of Harmonic’s Aristotle models uses synthetic data generation.

The majority of the data in the training datasets was collected during the period of July 2023 to July 2024, and data collection remains ongoing.