AI’s progress has hit a critical constraint: access to real-world data. While public datasets and web scraping powered AI’s early breakthroughs, today’s models demand proprietary data from hospitals, enterprises, studios, and regulated environments, data that has been locked away behind legal, technical, and governance barriers. This bottleneck affects every stage of AI development, from pre-training to evaluation, forcing model builders to rely on synthetic data that can’t fully replicate the complexity of human behavior and real-world scenarios. Protege addresses this fundamental gap by creating a platform where data holders can license their proprietary datasets while maintaining privacy, IP protections, and compliance, enabling AI builders to access clinical records, media content, audio conversations, motion capture data, and other hard-to-find information at scale. Working with data partners across healthcare, media, and motion capture, the company has aggregated access to billions of data points, including over 3B clinical notes, 100M medical images, 500K+ hours of video content, and 500K+ hours of audio across 50+ languages. With its recent acquisition of Calliope Networks and partnerships spanning from the majority of “Magnificent Seven” tech companies to hundreds of data providers, Protege is becoming the central infrastructure layer connecting proprietary data with AI development needs.

AlleyWatch sat down with Protege CEO and Co-Founder Bobby Samuels to learn more about the business, its future plans, recent funding round, and much, much more.

Who were your investors and how much did you raise?

Protege raised $30M in a Series A1 round led by Andreessen Horowitz (a16z). The financing expands the company’s $25M Series A from August 2025 and brings total funding to roughly $65M since Protege’s founding in 2024. The round also includes follow-on participation from existing investors such as Footwork, CRV, Bloomberg Beta, Flex Capital, Shaper Capital, and more.

Tell us about the product or service that Protege offers.

Protege is an AI data platform unlocking access to trusted, real-world data at scale. We’re transforming how the world’s real data powers AI, enabling people and institutions to contribute their data safely and shape intelligence built on integrity, expertise, and human purpose. We work with private data holders across healthcare, media, and other industries to license and curate high-quality datasets that AI builders need for training, evaluation, and benchmarking. Our role is to act as the connective tissue between these two sides, making it possible to unlock valuable data while preserving privacy, IP rights, and regulatory compliance.

At its core, Protege is about turning data that’s historically been siloed, sensitive, or underutilized into a responsibly governed asset. We focus on real-world data across industries because that’s what ultimately determines how AI systems perform once they leave the lab and operate in real environments.

What inspired the start of Protege?

While AI models and compute have advanced rapidly, access to the right data has become a bottleneck. The vast majority of the world’s most valuable data, especially in regulated industries like healthcare, is not publicly available, and synthetic or manufactured data can’t fully replicate real-world complexity. Protege was born from the belief that AI’s next leap will come from unlocking real-world data that is ethically sourced, expert-curated, and shared on human terms.

My co-founders and I had spent years working in privacy-first data ecosystems, and we saw an opportunity to apply those lessons to AI. We believed there was a better path forward than scraping data from the internet: one that compensated data holders, respected privacy, and enabled AI builders to train systems that would actually work in the real world.

How is Protege different?

We’re built around licensed, real-world data from day one. When AI builders come to Protege, they’re looking for real-world data: the most authentic signal of how people and systems actually behave. This isn’t synthetic data created by AI, nor manufactured data created to simulate human behavior. Across every stage of the AI development lifecycle, from pre-training to post-training to fine-tuning to evaluation, AI builders need this data. They’re looking across modalities and industries: healthcare, video, audio, motion capture, gaming, manufacturing, life sciences, real estate, finance, education, and many more. Foundational, multi-modal model builders (including the majority of the Magnificent Seven) now work with us across multiple domains, along with dozens of other model builders.

We also focus on curation and fit-for-purpose datasets rather than volume alone. As AI builders’ needs have matured, they’ve shifted from “more data” to “the right data,” and our platform is designed to meet that demand, whether it’s representative clinical scenarios in healthcare, highly specific content in media, or up-to-date audio and motion capture needs. We unlock revenue for data providers as well, empowering data stewards to share their data assets safely and help AI learn responsibly, so that progress is both powerful and representative of the broader human population.

What market does Protege target and how big is it?

Protege sits at the intersection of AI development and proprietary data, serving both AI builders and data holders across multiple verticals, such as healthcare, media, motion capture, and more. Fundamentally, there are three bottlenecks to AI progress: compute, models, and data. There are already multiple companies in the first two categories worth billions, potentially trillions. There’s yet to be a dominant player in the data that’s needed for AI development, and that’s the gap that Protege aims to fill.

As AI becomes more multimodal and more embedded in real-world workflows, demand for licensed, domain-specific data will only grow. We believe solving AI’s data access problem is a generational opportunity, and the market spans nearly every industry touched by AI.

What’s your business model?

We currently operate as a two-sided data platform for AI development, where AI builders purchase licensed datasets and data holders are compensated through structured agreements. We earn revenue for facilitating access and providing value-added services like curation and de-identification where appropriate. Over time, we have also expanded into benchmarks and evaluation datasets to support AI development across the full lifecycle, not just initial training.

How are you preparing for a potential economic slowdown?

In our industry, we’ve seen an acceleration in demand across the different verticals that we serve. Specifically, we feel well-positioned to take advantage of not only the growing need for data for AI development but also the growing trend toward ethical data licensing for AI across industries.

This has the potential to offer other companies, organizations, and rights holders in industries that are susceptible to economic slowdowns an additional revenue stream that didn’t previously exist. These are win-win situations where data rights holders can benefit from their existing assets, and we as a company are able to help package that data and connect data holders with AI builders actively seeking out these proprietary data sources. This helps to insulate us from broader market conditions while also providing others opportunities beyond their current business lines.

What was the funding process like?

Protege has been growing quickly, and we were seeing clear signals in the market that there was an opportunity to raise capital in a way that would meaningfully accelerate what we were already doing: expanding data partnerships, hiring thoughtfully, and staying flexible around potential strategic opportunities. a16z stood out as the right partner given their depth in data infrastructure, AI, and healthcare, as well as the long-term orientation they bring to company building.

This round gives us more opportunities to accelerate product development, significantly expand Protege’s data network into new domains and data formats, deepen partnerships with leading institutions, and scale the team and infrastructure required to deliver AI-ready and rights-protected access to real-world data. At the same time, we get to bring on a world-class partner who’s deeply connected to the ecosystem in which we operate.

Having Daisy Wolf, Partner at a16z, invest in us was an important part of that decision, given that her experience in healthcare and data is highly aligned with where we’re going. The round moved quickly and included continued participation from our existing investors, which we see as a strong vote of confidence in both the business and the direction we’re heading.

What are the biggest challenges that you faced while raising capital?

A big factor that’s often overlooked is how we convey our vision for the world and how we as a company fit into it when the world is changing so quickly. This is especially true in the AI space, where new models are released what seems like every week, and innovation (and disruption) is happening left and right. So having a clear and crisp vision that we can communicate to investors is paramount to ensuring that we see eye-to-eye with them quickly. This helps investors develop conviction in our vision and mission quickly, while also ensuring that we feel confident that we’ve chosen the right partner for the long haul.

What factors about your business led your investors to write the check?

For years, the open internet powered rapid advances in AI, but that resource is now largely exhausted. Public datasets, such as Common Crawl, capture only a small slice of the web, while the vast majority of high-value data lives offline, inside hospitals, enterprises, studios, and other regulated or proprietary environments. The real bottleneck has shifted to accessing real-world data responsibly. Investors see Protege as essential infrastructure for that next phase, enabling licensed, privacy-preserving access to the data AI systems need to perform reliably in practice. In addition, folks noted the strength of the team from a variety of backgrounds, ranging from healthcare data to media to tech startups and more.

What are the milestones you plan to achieve in the next six months?

In the next six months, Protege aims to expand its verticals beyond healthcare, audiovisual, and motion capture, with the goal of becoming a trusted source of licensed, real-world data across domains.

Beyond just training data, the Protege platform plans to evolve so that its infrastructure supports all stages of the AI model development cycle, such as pre-training, post-training, fine-tuning, evaluation and benchmarking, and inference, allowing for more advanced evaluation.

What advice can you offer companies in New York that do not have a fresh injection of capital in the bank?

As in earlier eras, the one advantage that smaller companies and startups have that incumbents don’t is speed. In the age of AI, this is especially true: creating new products, testing new ideas, and reaching new partners at scale has never been cheaper or faster. While this can cause traditional channels to become saturated, it also creates a world where it’s never been easier for great ideas to reach the right audiences that care about what you are building.

As a result, leaning into the speed advantage is almost never a bad idea in the early stages. It increases the surface area of opportunities, while also creating more chances to discover new insights and pivot as necessary in the ever-changing landscape.

Where do you see the company going now over the near term?

Over the near term, Protege is focused on becoming the central platform for real-world, licensed data used in AI development across industries, while also being the leading voice in AI data best practices for model builders. We believe that human data that’s reflective of human activity in the real world will continue to play a greater and greater part in AI development. We aim to be the trusted leader for this kind of data in the broader AI ecosystem.

What’s your favorite winter destination in and around the city?

I’m a big fan of a new AI-powered karaoke studio called Beatbox. It’s a ton of fun and a great space. (Though full disclosure, my wife and her cofounder opened it up late last year.)
