How We Score 1.98 Million Products
A transparent look at the data sources, scoring algorithms, and validation processes behind our two-score system for measuring food processing and nutrition.
Overview and Mission
Ultra Processed Food List exists to answer a simple question: how processed is the food you eat? We apply algorithmic scoring to every branded food product in the United States, transforming raw ingredient lists and nutrition facts into two actionable numbers -- a Processing Score and a Nutrition Score.
Our database currently contains 1.98 million products from 36,000+ brands, organized into 13 consumer-friendly categories. Every product receives both scores, calculated through a deterministic pipeline that treats each item identically regardless of brand, price, or marketing claims.
Transparency is central to our mission. This page documents exactly how scores are calculated, what data sources we use, where the system has known limitations, and how we validate results. No black boxes, no proprietary algorithms hidden behind vague descriptions -- just open methodology you can evaluate for yourself.
Why two scores? Processing and nutrition measure fundamentally different things. A food can be heavily processed yet nutritionally dense, or minimally processed yet nutritionally poor. Collapsing both dimensions into a single number loses information consumers need.
Data Sources
Every score in our system traces back to publicly available data from the United States Department of Agriculture. We do not rely on proprietary datasets, manufacturer partnerships, or user submissions.
Primary Source
The USDA FoodData Central Branded Food Products database is our primary source. Manufacturers and retailers submit detailed product information to it, including complete ingredient lists, nutrition facts panels, UPC barcodes, brand names, and product categories.
1.98 million branded products
What We Have Access To
- ✓ Full ingredient lists (ordered by weight)
- ✓ Complete nutrition facts (calories, macros, vitamins, minerals)
- ✓ UPC/barcode identifiers
- ✓ Brand names and product descriptions
- ✓ Serving size information
Data Freshness
The USDA dataset receives continuous updates from manufacturers. We run our complete processing pipeline quarterly to incorporate new products, updated formulations, and corrected entries.
Between major updates, our database reflects a point-in-time snapshot. Product formulations do change -- always verify current ingredients on the physical label.
Data coverage note: Approximately 93% of products have complete macronutrient data (calories, protein, fat, carbohydrates). Vitamin and mineral data is available for a smaller subset. Image coverage stands at approximately 25% through UPC-based matching. Products without ingredient data cannot receive a Processing Score and are excluded from scoring.
The Two-Score System
We deliberately separated processing measurement from nutrition measurement. Each score captures a distinct dimension of food quality, and the interplay between them reveals patterns that neither score could surface alone.
Processing Score
Lower is better
Measures how far a food has been transformed from its natural state through industrial processing. It is based on ingredient count and on the presence of artificial additives, preservatives, and other manufacturing indicators.
Nutrition Score
Higher is better
Evaluates the nutritional profile based on the presence of beneficial nutrients and the absence of harmful ones. Starts at a baseline and adjusts based on positive and negative factors.
For detailed breakdowns of each scoring system, see our Processing Score Guide and Nutrition Score Guide.
Processing Score Deep Dive
The Processing Score is built from two components: a base score determined by the number of ingredients, and a set of additive penalties triggered by the presence of specific processing markers. The final score is the sum of both.
Base Scores by Ingredient Count
| Ingredient Count | Base Score | Description |
|---|---|---|
| 1 ingredient | 1.0 | Single-ingredient foods (olive oil, plain rice, raw honey) |
| 2-3 ingredients | 1.5 | Simple combinations (salted butter, nut butter with salt) |
| 4-6 ingredients | 3.5 | Basic processed foods (simple bread, canned soup) |
| 7-10 ingredients | 5.0 | Standard processed foods (crackers, sauces, dressings) |
| 11-15 ingredients | 6.5 | Complex processed foods (frozen meals, protein bars) |
| 16+ ingredients | 8.0+ | Heavily formulated products (score continues to scale with count) |
Processing Penalties
When our algorithm detects specific processing markers in the ingredient list, additional penalty points are added to the base score. A single product can trigger multiple penalties.
| Processing Marker | Penalty | Why It Matters |
|---|---|---|
| Artificial ingredients | +2.0 | Artificial colors, flavors, or sweeteners indicate industrial formulation not replicable in a home kitchen |
| High fructose corn syrup (HFCS) | +1.5 | Industrially produced sweetener requiring enzymatic conversion of corn starch |
| Hydrogenated oils | +1.5 | Industrial hydrogenation process that extends shelf life and, when the oil is only partially hydrogenated, creates trans fats |
| BHA/BHT preservatives | +1.5 | Synthetic antioxidants used to prevent rancidity in processed foods |
| Modified ingredients | +0.8 | Modified starches, protein isolates, and chemically altered food components |
| Other preservatives | +0.6 | Sodium benzoate, potassium sorbate, calcium propionate, and similar shelf-life extenders |
Example Calculation
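As an illustration, the arithmetic can be sketched in Python using the values from the two tables above. The product here is hypothetical: 8 ingredients (base score 5.0) that include high fructose corn syrup (+1.5) and one other preservative (+0.6), for a total of 7.1. The function names and marker keys are our own labels for this sketch. Two assumptions are worth flagging: each marker category is counted once per product, and the 16+ bracket returns only its 8.0 floor, because the rule for scaling beyond that is not spelled out on this page.

```python
# Base scores and penalties copied from the two tables above.
# Each pair is (highest ingredient count in the bracket, base score).
BASE_SCORE_BRACKETS = [(1, 1.0), (3, 1.5), (6, 3.5), (10, 5.0), (15, 6.5)]
BASE_SCORE_16_PLUS = 8.0  # floor only; the scaling beyond 16 is not specified here

# Marker keys are hypothetical labels for the rows of the penalty table.
PENALTIES = {
    "artificial_ingredients": 2.0,
    "hfcs": 1.5,
    "hydrogenated_oils": 1.5,
    "bha_bht": 1.5,
    "modified_ingredients": 0.8,
    "other_preservatives": 0.6,
}


def base_score(ingredient_count: int) -> float:
    """Look up the base score for a given number of ingredients."""
    if ingredient_count < 1:
        raise ValueError("a scored product needs at least one ingredient")
    for max_count, score in BASE_SCORE_BRACKETS:
        if ingredient_count <= max_count:
            return score
    return BASE_SCORE_16_PLUS


def processing_score(ingredient_count: int, markers: set) -> float:
    """Base score plus one penalty per detected marker category (assumption)."""
    total = base_score(ingredient_count) + sum(PENALTIES[m] for m in markers)
    return round(total, 1)


# Hypothetical product: 8 ingredients, HFCS and one other preservative.
print(processing_score(8, {"hfcs", "other_preservatives"}))  # 5.0 + 1.5 + 0.6
```

Passing the detected markers in as a set keeps the sketch independent of how markers are actually recognized in ingredient text, which this page does not describe.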
Four Processing Levels
Final Processing Scores are mapped to four levels that provide plain-language context. The thresholds and current distribution across our database:
| Level | Score Range | Label | % of Products | Typical Products |
|---|---|---|---|---|
| Level 1 | ≤ 2.5 | Minimally Processed | ~9.2% | Single-ingredient foods, plain nuts, pure oils |
| Level 2 | > 2.5 and ≤ 5.0 | Processed | ~31.1% | Simple bread, canned vegetables, cheese, yogurt |
| Level 3 | > 5.0 and ≤ 8.0 | Highly Processed | ~18.4% | Frozen meals, flavored yogurts, granola bars |
| Level 4 | > 8.0 | Ultra-Processed | ~41.3% | Snack cakes, soft drinks, candy, instant meals |
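The threshold mapping in the table can be expressed as a short lookup. This is a minimal sketch: the function name is ours, and the only logic it encodes is the four score ranges and labels listed above.

```python
def processing_level(score: float) -> tuple:
    """Map a Processing Score to its (level number, label) pair."""
    if score <= 2.5:
        return 1, "Minimally Processed"
    if score <= 5.0:
        return 2, "Processed"
    if score <= 8.0:
        return 3, "Highly Processed"
    return 4, "Ultra-Processed"


for score in (1.0, 5.0, 7.1, 9.0, 22.0):
    level, label = processing_level(score)
    print(f"{score:>5} -> Level {level} ({label})")
```

Note that 9.0 and 22.0 both land in Level 4, which is why the underlying score is shown alongside the level.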
Nutrition Score Deep Dive
The Nutrition Score evaluates what a food provides nutritionally, independent of how it was manufactured. Scores range from 0 to 10, with higher scores indicating more nutritional value. The database average is 4.82.
Positive Factors (Increase Score)
| Factor | Max Bonus |
|---|---|
| Protein content | +3.0 |
| Fiber content | +2.0 |
| Fermented dairy | +1.0 |
Negative Factors (Decrease Score)
| Factor | Max Penalty |
|---|---|
| Added sugars | -3.0 |
| Sodium | -2.0 |
| Saturated fat | -2.0 |
| Trans fat | -1.5 |
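The way capped bonuses and penalties combine can be sketched as follows. The caps come from the two tables above and the 0 to 10 range from the section introduction. Everything else is an assumption made for illustration: the baseline value is not stated on this page, so it is passed in as a parameter; the rules that turn grams of protein or milligrams of sodium into points are not documented here, so the sketch takes already-computed point values; and clamping the result to the 0 to 10 range is our guess at how the bounds are enforced.

```python
# Caps copied from the factor tables above; keys are hypothetical labels.
MAX_BONUS = {"protein": 3.0, "fiber": 2.0, "fermented_dairy": 1.0}
MAX_PENALTY = {"added_sugars": 3.0, "sodium": 2.0, "saturated_fat": 2.0, "trans_fat": 1.5}


def capped_total(points: dict, caps: dict) -> float:
    """Sum the points for each known factor, limited to 0..cap per factor."""
    return sum(min(max(points.get(name, 0.0), 0.0), cap) for name, cap in caps.items())


def nutrition_score(baseline: float, bonuses: dict, penalties: dict) -> float:
    """Baseline plus capped bonuses minus capped penalties, clamped to 0..10.

    Penalties are passed as positive magnitudes (1.5 means "subtract 1.5").
    """
    raw = baseline + capped_total(bonuses, MAX_BONUS) - capped_total(penalties, MAX_PENALTY)
    return round(min(max(raw, 0.0), 10.0), 1)


# Hypothetical product scored against a hypothetical baseline of 5.0.
print(nutrition_score(5.0, {"protein": 2.0, "fiber": 1.0}, {"added_sugars": 1.5, "sodium": 0.5}))
```

With the maximum bonuses totalling 6.0 and the maximum penalties totalling 8.5, some bound on the result is needed for any plausible baseline, which is the reason for the clamp in this sketch.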
How Our System Relates to NOVA
The NOVA food classification system is the most widely cited academic framework for categorizing food by processing level. Our system was informed by NOVA principles but designed to address specific limitations of categorical classification.
| Aspect | NOVA System | Our System |
|---|---|---|
| Scale type | 4 categorical groups | Continuous score (1–32+) |
| Within-group distinction | None -- all Group 4 foods are equivalent | Full gradation (scores of 8.5 and 22.0 are visibly different) |
| Nutrition assessment | Not included | Separate 0–10 Nutrition Score |
| Assignment method | Manual classification by trained researchers | Algorithmic analysis of ingredient lists |
| Coverage | Typically applied to hundreds or thousands of products in studies | Applied to 1.98 million products automatically |
Where the Systems Agree
- ✓ Single-ingredient whole foods consistently receive the lowest processing classifications in both systems
- ✓ Products with artificial additives, HFCS, and hydrogenated oils are flagged as highly processed by both systems
- ✓ The same ingredient markers (emulsifiers, protein isolates, artificial sweeteners) drive classification in both frameworks
Where They Differ
- NOVA treats all ultra-processed foods equally; our system distinguishes a Processing Score (PS) of 8.5 from a PS of 25.0
- NOVA does not account for nutritional content; we provide a separate Nutrition Score (NS) for that dimension
- NOVA classification requires expert judgment for edge cases; our algorithm applies the same rules to every product deterministically
Our position: NOVA is a valuable research framework and our system is complementary to it, not a replacement. We use a continuous score because consumers benefit from knowing that a product scoring 9.0 is meaningfully less processed than one scoring 22.0, even though both would fall into NOVA Group 4. For a full explanation of NOVA, see our NOVA Food Classification Guide.
Limitations and Edge Cases
No scoring system is perfect. We believe transparency about limitations is more valuable than projecting false precision. Here are the known areas where our methodology has weaknesses or produces counterintuitive results.
Products That Score Unexpectedly High
Fortified health foods
Protein powders, meal replacement shakes, and fortified cereals often contain protein isolates, emulsifiers, and synthetic vitamins that trigger processing penalties -- even though these products are marketed as healthy. The Nutrition Score helps balance this by reflecting actual nutrient content.
Products with long but simple ingredient lists
A trail mix with 15 types of nuts, seeds, and dried fruits receives a base score of 6.5, while a product with 5 ingredients receives 3.5, even though each individual trail mix ingredient is minimally processed. The base score reflects ingredient complexity, which is an imperfect proxy for processing level.
Products That Score Unexpectedly Low
Simple but nutritionally poor foods
A candy made from just sugar, corn syrup, and food coloring will score lower on processing than a complex whole-grain bread with 12 identifiable ingredients. Fewer ingredients means a lower base score (1.5 versus 6.5 in this case), even when the product provides little nutritional value.
Incomplete ingredient data
Some products in the USDA database have abbreviated or incomplete ingredient lists. When ingredients are missing from the data, the ingredient count is understated and any processing markers among the missing ingredients go undetected. These products may receive artificially low scores.
Known Data Gaps
Categories that are harder to classify: Supplements, baby food, and ethnic/specialty foods present particular challenges. Supplements often contain dozens of synthetic vitamins and minerals that inflate processing scores even when the base product is simple. Baby food formulations vary widely. Ethnic and specialty foods may use traditional ingredients that our detection algorithm does not yet recognize as processing markers (or incorrectly flags as such). We continue to refine these categories with each pipeline update.
Update Process and Data Quality
Our data pipeline is a 13-step incremental process. Each step runs independently, validates its output, and can be rolled back without affecting other steps. This architecture allows us to update individual components (such as ingredient normalization) without reprocessing the entire database.
The 13-Step Pipeline
Validation at Every Step
Row Count Verification
Each step confirms that expected row counts are maintained and no products are silently dropped from the pipeline.
Distribution Checks
Score distributions are compared against expected ranges. If the share of Level 4 products suddenly drops below 30% or rises above 50%, the step is flagged for manual review.
Rollback Capability
Each step operates on a working copy. If validation fails, the step is rolled back and the previous state is preserved without affecting downstream data.
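The three checks above can be combined into one small wrapper around a pipeline step. This is a minimal sketch rather than our production code: the function names and the `upc` and `processing_score` fields are hypothetical, "expected row count" is simplified to "same as the input", and a failed distribution check is treated as a validation failure that triggers rollback in addition to being reported for review.

```python
import copy

LEVEL_4_THRESHOLD = 8.0            # scores above this are Level 4 (Ultra-Processed)
LEVEL_4_SHARE_RANGE = (0.30, 0.50)  # expected range from the distribution check


def level_4_share(products):
    """Fraction of products whose Processing Score falls in Level 4."""
    flagged = sum(1 for p in products if p["processing_score"] > LEVEL_4_THRESHOLD)
    return flagged / len(products)


def validate(before, after):
    """Return a list of problems; an empty list means the step passed."""
    problems = []
    if len(after) != len(before):
        problems.append(f"row count changed from {len(before)} to {len(after)}")
    if after:
        low, high = LEVEL_4_SHARE_RANGE
        share = level_4_share(after)
        if not (low <= share <= high):
            problems.append(f"Level 4 share {share:.1%} is outside {low:.0%}-{high:.0%}")
    return problems


def run_step(products, step):
    """Run one step on a working copy; keep the previous state if validation fails."""
    working_copy = copy.deepcopy(products)
    result = step(working_copy)
    problems = validate(products, result)
    if problems:
        return products, problems  # rollback: the input is returned untouched
    return result, []


# Hypothetical data: 10 products, 4 of them in Level 4.
products = [{"upc": str(i), "processing_score": 9.0 if i < 4 else 3.0} for i in range(10)]
_, ok = run_step(products, lambda rows: rows)
kept, problems = run_step(products, lambda rows: rows[:-1])  # silently drops a row
print(ok, problems)
```

Because the step only ever sees a deep copy, a step that mutates rows in place cannot corrupt the previous state, which is what makes the rollback a simple matter of returning the input.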
Processing time: The full 13-step pipeline takes approximately 4 hours to complete, with each step using no more than 4 GB of memory. This incremental approach replaced an earlier monolithic pipeline that required 12+ GB of RAM and offered no per-step rollback. The production database backup is preserved separately and is never modified by the pipeline.
How to Use These Scores
Scores are tools for comparison, not absolute judgments. They are most useful when comparing similar products within the same category or when evaluating the overall composition of your grocery basket.
Effective Uses
- ✓ Compare brands within a category: If you are choosing between three pasta sauces, search our database to find which has the lowest Processing Score
- ✓ Spot hidden processing: Two products may look similar on the front label, but their ingredient lists -- and therefore their scores -- can differ significantly
- ✓ Explore categories and brands: See which food categories and brands trend toward lower processing
- ✓ Read both scores together: A product with PS 12.0 / NS 7.5 tells a different story than PS 12.0 / NS 1.5
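For readers working with exported scores, a within-category comparison can be sketched in a few lines. The product names and values below are hypothetical, and the ordering rule (lowest Processing Score first, ties broken by the higher Nutrition Score) is one reasonable choice for illustration, not a ranking our site applies.

```python
from typing import NamedTuple


class Product(NamedTuple):
    name: str
    processing_score: float  # lower is better
    nutrition_score: float   # higher is better


def rank_within_category(products):
    """Sort by Processing Score ascending, then Nutrition Score descending."""
    return sorted(products, key=lambda p: (p.processing_score, -p.nutrition_score))


# Hypothetical pasta sauces; two share a Processing Score but differ on nutrition.
sauces = [
    Product("Sauce A", 12.0, 1.5),
    Product("Sauce B", 4.5, 6.0),
    Product("Sauce C", 12.0, 7.5),
]

for product in rank_within_category(sauces):
    print(f"{product.name}: PS {product.processing_score} / NS {product.nutrition_score}")
```

Sorting on both scores keeps the two dimensions separate while still giving a usable order, in line with reading the scores together rather than collapsing them into one number.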
What Scores Cannot Tell You
- × Whether a food is "good" or "bad": Scores are descriptive, not prescriptive. Dietary context, individual health needs, portion sizes, and overall eating patterns matter enormously
- × Real-time formulation changes: Manufacturers update recipes regularly. Our scores reflect the most recent USDA data, which may lag behind current labels
- × Allergen safety: While we detect some allergen indicators, always read the actual product label for allergen warnings
- × Health outcomes: We do not make health claims. Processing level correlates with certain dietary patterns but does not determine individual health outcomes
Frequently Asked Questions
Why do some healthy foods score poorly?
Our Processing Score measures the degree of industrial transformation, not nutritional value. A fortified protein bar may contain beneficial nutrients like protein, fiber, and vitamins, but if it also contains protein isolates, emulsifiers, artificial sweeteners, and stabilizers, its Processing Score will reflect that industrial complexity. This is exactly why we use two separate scores: a product can have a high Processing Score (indicating heavy industrial processing) while still earning a strong Nutrition Score (indicating meaningful nutrient content). The two scores together give a more complete picture than either one alone.
How often is the data updated?
Our database is built on the USDA FoodData Central Branded Food Products dataset, which receives updates from manufacturers throughout the year. We process major data refreshes on a quarterly basis, running our complete 13-step incremental pipeline to incorporate new products, updated ingredient lists, and revised nutrition facts. Between major refreshes, we apply targeted corrections when significant errors are identified. Each update goes through validation at every step before reaching production.
Why don't you use the NOVA system directly?
NOVA is an excellent categorical framework and our system is informed by its principles. However, NOVA assigns all ultra-processed foods to a single group (Group 4), which means a lightly sweetened yogurt with one emulsifier and a heavily processed snack cake with 30 additives receive the same classification. Our continuous Processing Score (1 to 32+) preserves the gradations within each category, letting consumers see that a product scoring 8.5 is meaningfully different from one scoring 22.0. We also pair the Processing Score with a separate Nutrition Score, which NOVA does not address.
Can a product have a high Processing Score but good Nutrition Score?
Yes, and this happens more often than you might expect. Fortified breakfast cereals, protein powders, and meal replacement shakes frequently score above 8.0 on the Processing Score (ultra-processed) while earning 6.0 or higher on the Nutrition Score thanks to added protein, fiber, vitamins, and minerals. Conversely, a product like cotton candy might score moderately on processing (fewer ingredients) but receive a very low Nutrition Score due to being almost entirely sugar. The two-score system captures these nuances that a single score would miss.
Where does your data come from?
Our primary data source is the USDA FoodData Central Branded Food Products database, which contains detailed information submitted by food manufacturers and retailers. This includes ingredient lists, nutrition facts panels, UPC barcodes, brand names, and product categories. We supplement this with UPC-based image matching and affiliate link data from retail sources; neither feeds into the scores. All 1.98 million products in our database originate from the USDA dataset, which is the most comprehensive publicly available source of branded food product information in the United States.
Related Guides
What Are Ultra-Processed Foods? →
A comprehensive introduction to ultra-processed foods and why they matter for your diet
NOVA Food Classification →
The academic framework behind food processing classification and how it compares to our system
Processing Score Guide →
Detailed breakdown of our 1-32+ Processing Score with real product examples
Disclaimer: All tools and data visualizations are provided for educational and informational purposes only. They are not intended as health, medical, or dietary advice. Product formulations change frequently — always check the actual label for current ingredients and nutrition facts before making purchasing decisions. Consult healthcare professionals for personalized dietary guidance.