Building the control point for agentic platforms

An interview with Kong CTO Marco Palladino

There is a dialectic between the use case and the platform in enterprise technology, especially in the wake of innovation. Client-server, e-commerce, mobile, cloud? Business users demanded use cases that they believed would increase revenues or reduce costs, and they wanted them as quickly as possible, technical architecture be hanged.

This was great in that it demonstrated value, created demand and accelerated organizational learning. This was less great in that it created risk and technical debt. Sometimes it foreclosed future options and turned out to be like trying to get to the moon by climbing the tallest tree: the first fifty feet feel like great progress.

So too with agents. Enterprise technology functions must go quickly to meet business expectations, but also build the platforms that support agentic workloads with resiliency, security and efficiency.

Kong is a provider of API and AI gateways. I spoke with its CTO, Marco Palladino, about the evolution from API to AI gateways and how AI gateways fit into a broader agentic strategy.

Some of the most interesting comments from our discussion:

  • “The dream has always been the same: we can build an assembly line of software so we can create new products faster. We can innovate faster by taking existing APIs and assembling them together — like the Ford assembly line, but for software.”

  • “AI is useless if agents cannot use APIs. Agents consume two things: they consume LLMs for intelligence, and they consume APIs to do something with that intelligence.”

  • “Not appreciating non-determinism in AI is like organizations in the early days adopting containers but deploying them one container per virtual machine. You’re using containers, but not really — you’re losing the whole point.”

  • “We’re going to have agents telling agents to go build agents. It’s going to be agents all the way down.”

  • “But when the agent becomes the buyer, all of a sudden the agent doesn’t care about the billboard. It doesn’t care about the YouTube video. How does an agent decide what product to use? It looks at documentation. It looks at examples, getting-started guides, what other agents have been doing.”

James Kaplan: This is James Kaplan with another Prosaic Times podcast. I’m here with Marco Palladino from Kong to talk about AI gateways, API gateways, and security in the AI era. Marco, welcome.

Marco Palladino: Thanks for having me.

James Kaplan: Great pleasure. Give us a little bit of your background. Tell us about your journey to Kong.

Marco Palladino: I’m the CTO and co-founder of Kong, and I’ve been doing this API thing for fifteen years now — since the days when people were asking us what an API was. Now, obviously, everybody knows what an API is, and APIs are the backbone of pretty much the digital world, including AI.

So I started Mashape, which was an API marketplace that didn’t go anywhere — it was too early. We then open-sourced our core technology, and that became Kong. Kong started as an API management product, but over time it expanded to encompass all of connectivity: API connectivity, AI connectivity, microservices connectivity, mesh connectivity.

Today we work with more than 1,000 enterprise customers around the world, and we’re very excited about what’s next.

James Kaplan: Some of us are old enough to remember when an API was something exposed from a Windows Dynamic Link Library, and you had to make sure you had the right version of the DLL in the right directory so your application could consume the right version of the API.

Marco Palladino: Everybody remembers DLL Hell.

James Kaplan: Tell us about how central API gateways were in the evolution of a modern artificial intelligence stack. What makes for a good API gateway? What makes for a good user of an API gateway?

Marco Palladino: The concept of an API has been there since forever. Even in the SOA world of the early 2000s, there was always this dream of web services we could use, consume, and integrate into new applications — but it didn’t go anywhere. It was too complex. Nobody wants to write SOAP or consume SOAP. And then new programming languages emerged — JavaScript, Ruby on Rails — where SOAP was very hard to consume. So it really didn’t get any traction whatsoever.

Then RESTful APIs, the modern ones as we know them, really became popular when Google created their own public APIs and Facebook had APIs that let people access the social graph. All of these apps were talking to monolithic backends through an API.

The iPhone really created a whole new need for APIs — not only to create ecosystems of integrations, but also to handle internal communication. We needed an API to connect our app to the backends, which were monolithic back then.

Then something happened in 2013: Docker was invented and released. In 2014, Kubernetes was created and released, and all of a sudden APIs, which used to be an afterthought bolted on top of monolithic applications to connect with a mobile app, were there from day one. We were building API-first because we were building microservices running in containerized Kubernetes environments across one or multiple clouds, and APIs as a means of internal communication became quite essential.

The dream has always been the same: we can build an assembly line of software so we can create new products faster. We can innovate faster by taking existing APIs and assembling them together — like the Ford assembly line, but for software. And I think what we’re seeing now is the latest iteration of APIs as the backbone of the digital world when it comes to AI.

There is no AI without APIs. We need an API to consume the models. We need an API to consume MCP tools, the data, and the systems and services that we want our agents to use. APIs are everywhere. If anything, APIs are going to keep increasing in numbers.

James Kaplan: I like to say every problem in computer science gets solved by abstraction, and APIs are a mechanism for abstraction. Let me ask you about adoption. We all know there’s pervasive API usage in the consumer internet and enterprise SaaS space, especially externally facing. I was wondering if you could comment on the traditional enterprise — banks, pharmaceutical companies. How far did they get in terms of creating APIs that allowed different applications, or different components of applications, to talk to each other? I think that has historically been a bigger lift.

Marco Palladino: You really only have two options: either you have APIs, or you have silos. APIs help us break silos. Many companies have not invested in APIs, or they’ve done it in a non-coordinated way — maybe they have APIs within one team or one product, but they don’t have a repository of APIs that the rest of the organization can look at and use when building new agents, new experiences, new products. Only a few organizations have made that kind of investment, and those are the ones that are going to move and innovate a lot faster.

Look at the best example — and I know it’s a bit of an old example, but it’s still one of the best. Look at Amazon. At one point, Jeff Bezos wrote a memo in the mid-2000s basically telling every team: you have to build APIs for every new product from day one, and if you’re not doing that, you’re fired. APIs were mandated. And because of that top-down push from the CEO and founder himself, Amazon was able to create AWS. Amazon was not the only e-commerce company in the world, and yet it was the only one that was able to generate a multi-hundred-billion-dollar business with AWS — thanks to their API culture and engineering methodology. When everything ships with an API, you can not only integrate it, you can productionize it and start selling it and creating new revenue streams. That’s how AWS was born.

So what’s the status of APIs in the enterprise? You see a few organizations making the investment because they understand APIs as a path to innovate and move faster. And then you see lots of laggards — organizations that are just now waking up to the fact that they cannot capture AI if they don’t invest in APIs. Better late than never, but this is an investment that should have been made ten years ago.

James Kaplan: When you said the alternative to APIs is silos — at many places, the method of integration is still moving flat files around. What drives the laggards? For the folks who have not moved to a more API-forward architecture, is it lack of skills? Lack of investment? A belief that an API-forward architecture is harder to manage? What have been the barriers, in your thinking?

Marco Palladino: Nobody makes the call. I’ve been working with hundreds of top Fortune 500 and Global 2000 companies around the world — we’re a global company across North America and Europe — and I see patterns everywhere. There is a belief that if you build an internal platform that supports APIs, the teams will eventually adopt it. I think that’s only partially true.

At some point, it can’t just be the carrot. You show the teams what needs to be done, you show them that adopting APIs will benefit them. But at some point, you also have to use the stick. Many organizations and their leadership think APIs are someone else’s job, so they never make that top-down call. And eventually you have to call it for what it is: teams, we have to use APIs, because if you don’t use APIs, we cannot reuse anything you’re building in any other product, any other market, or with any other partner.

There must be a top-down leadership call — but it can’t be mandate alone. You can’t tell teams to adopt APIs and then not give them a platform to use. So yes, there needs to be technology, but at some point there needs to be a top-down leadership call as well. Otherwise, you’re not part of this engineering culture. Many organizations fail because they don’t want to make that call — and that can be deadly.

James Kaplan: One of the most interesting examples I’ve seen of a company adopting APIs — they took a very developer-services point of view. They said: we’re going to do all the things we would do if we were a third-party software company selling a developer platform. We’re going to invest heavily in documentation. We’re going to invest in developer support. We’re going to hold conferences with developers. I thought that worked nicely — it was an interesting change management mechanism. They really treated the developers as customers rather than as people who could simply be mandated to use a platform.

Marco Palladino: I fully agree, and that’s the right way to do it. Developers are internal customers — and APIs are products. They have a lifecycle. Part of the problem is that many organizations don’t see APIs as products, but they are. Just as we version websites and mobile apps, adding features and removing them, APIs have a product lifecycle too: we version them, create new features, decommission them. That’s work that needs to be done.

Developers who are busy building feature after feature need to slow down at some point, look at their API portfolio, clean it up, and treat it as a product. An organization can’t have a great API ecosystem without allocating the right amount of time for developers and teams to curate those APIs. But it pays dividends: once those APIs are consumable, you can reuse them in countless places.

And especially for AI — I can’t stress this enough — AI is useless if agents cannot use APIs. Agents consume two things: they consume LLMs for intelligence, and they consume APIs to do something with that intelligence. Without an ecosystem of APIs to hook into, those agents may be smart but useless, because they can’t connect to anything meaningful for the business. APIs unlock AI.

James Kaplan: Which creates a lot of issues around control and security. So let’s talk about the transition to an AI gateway. Where does an AI gateway fit into the system? How do we think about how an AI gateway relates to things like MCP and A2A?

Marco Palladino: When we think about agents, we’re thinking about smart applications. An agent is coded, runs somewhere. What does it do? It talks to an LLM to determine what the next operation should be, and then it talks to data and services to get the right inputs to do the job it’s supposed to do. For example, if we’re building an agent that does loan origination for a bank, we need access to APIs about our customers — their social security numbers, their KYC information, and so on — and we can’t do that without investment in APIs.

Now, agents using AI will also have to deal with the non-deterministic nature of AI. When we build a traditional application, we know what’s going to happen. But when we use AI, we’re giving the LLM some degree of freedom to determine what needs to happen with an input and what output to generate. That output is non-deterministic compared to how we used to build applications, and in a non-deterministic world, we need guardrails and capabilities to control what AI does. This is the whole area of AI governance.

As more teams build agents, we don’t want them to reinvent the wheel — building their own guardrails for each specific use case. The platform team can’t monitor what all these calls are about, and the organization doesn’t have confidence that agents are doing the right things at the right time. What we need is a platform that manages all these AI interactions: AI governance, AI security, AI optimizations — including the ability to compress prompts to get more out of the tokens we’re spending. All of that can be abstracted away from individual teams by the platform team, which offers it as a service to internal developer customers.

That’s effectively what an AI gateway does: it centralizes those cross-cutting requirements, just as API management did for APIs — where we took authentication, security, rate limiting, traffic control, and encryption and pushed them into a centralized place.
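To make the idea of centralizing cross-cutting requirements concrete, here is a minimal sketch of a single choke point that every agent request passes through. It is not Kong's implementation or API. The `Policy` and `Gateway` classes, the agent, model, and tool names, and the simple per-agent request counter are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Policy:
    """Central rules owned by the platform team (all names are hypothetical)."""
    allowed_models: set[str]
    allowed_tools: dict[str, set[str]]  # agent id -> tools that agent may call
    max_requests_per_agent: int         # crude stand-in for rate limiting


@dataclass
class Gateway:
    """One place where access rules, rate limits, and audit logging live."""
    policy: Policy
    request_counts: dict[str, int] = field(default_factory=dict)
    audit_log: list[tuple[str, str, str, str]] = field(default_factory=list)

    def authorize(self, agent_id: str, kind: str, target: str) -> bool:
        """Decide whether an agent may call a model or tool, and record the decision."""
        count = self.request_counts.get(agent_id, 0) + 1
        self.request_counts[agent_id] = count

        if count > self.policy.max_requests_per_agent:
            decision = "deny:rate_limit"
        elif kind == "model" and target in self.policy.allowed_models:
            decision = "allow"
        elif kind == "tool" and target in self.policy.allowed_tools.get(agent_id, set()):
            decision = "allow"
        else:
            decision = "deny:not_allowed"

        # Every decision is captured centrally, so teams do not rebuild logging.
        self.audit_log.append((agent_id, kind, target, decision))
        return decision == "allow"


# Illustrative setup: one loan agent, one trusted model, one permitted tool.
policy = Policy(
    allowed_models={"trusted-model"},
    allowed_tools={"loan-agent": {"kyc_lookup"}},
    max_requests_per_agent=100,
)
gateway = Gateway(policy)
print(gateway.authorize("loan-agent", "tool", "kyc_lookup"))
print(gateway.authorize("loan-agent", "tool", "wire_transfer"))
```

The point of the design is that individual agent teams never write these checks themselves. They call through the gateway, and the platform team changes the policy in one place.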

James Kaplan: It’s at least evocative of what cloud security posture management meant as we all started building out cloud architectures — an important control point that governed how you configured things in the cloud.

Marco Palladino: Absolutely. I’m a big believer that whatever technology we put in place for developers — whether to build APIs, AI agents, or microservices — needs to fit very naturally into their workflow and best practices. The platform is there to help, not to get in the way of getting to an outcome.

James Kaplan: So the AI gateway sits between the agent and the large language model externally, and between the agent and various tools — controlling or limiting what goes in and out of the institution, what goes to the large language model and what doesn’t, and which tools can be called under which circumstances. Is that a fair description?

Marco Palladino: That’s exactly right. It sits between the agents and the models and the tools. What is an MCP tool? Think of it as a new API protocol. APIs don’t have to be REST — they can be SOAP, REST, GraphQL, gRPC. MCP is a new protocol that makes it easier for agents not only to consume data or services, but also to discover what data and services are available. MCP bundles key requirements into the protocol itself: bidirectional real-time communication, tool discovery. It packages all of that in a very consumable way for agents.

With that said, it doesn’t have to be MCP. Many agents still consume APIs with traditional function calling and use any protocol they want. MCP has emerged as almost a standard stack for agent development — if you want to build something without overthinking it, MCP gives you an ecosystem that supports you. But it can be MCP or not. And the AI gateway sits in between not only all LLM transactions, but also all MCP tool requests.

Now, how large that MCP ecosystem or API ecosystem is will determine how much agents can do. If an organization hasn’t invested in creating an ecosystem of MCP tools or APIs, developers building agents will be somewhat handicapped — there’s not much for their agent to hook into. MCP is very exciting, and there’s more than just MCP — there’s also A2A, a different protocol that governs agent-to-agent interactions.
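For readers who have not seen MCP on the wire, the sketch below shows the rough shape of tool discovery and tool invocation. MCP messages are JSON-RPC 2.0, with `tools/list` for discovery and `tools/call` for invocation. The `kyc_lookup` tool, its schema, and the example response are hypothetical, and the sketch builds the messages without implementing a transport or a full client.

```python
from __future__ import annotations

import json


def list_tools_request(request_id: int) -> dict:
    """Build a JSON-RPC request asking a server which tools it offers."""
    return {"jsonrpc": "2.0", "id": request_id, "method": "tools/list"}


def call_tool_request(request_id: int, name: str, arguments: dict) -> dict:
    """Build a JSON-RPC request that invokes one tool by name."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def tool_names(list_response: dict) -> list[str]:
    """Extract the discovered tool names from a tools/list response."""
    return [tool["name"] for tool in list_response["result"]["tools"]]


# Hypothetical discovery response for an illustrative KYC tool.
example_list_response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "tools": [
            {
                "name": "kyc_lookup",
                "description": "Look up KYC status for a customer (illustrative).",
                "inputSchema": {
                    "type": "object",
                    "properties": {"customer_id": {"type": "string"}},
                    "required": ["customer_id"],
                },
            }
        ]
    },
}

print(json.dumps(list_tools_request(1)))
print(tool_names(example_list_response))
print(json.dumps(call_tool_request(2, "kyc_lookup", {"customer_id": "c-001"})))
```

Because discovery is part of the protocol, an agent can learn at run time which tools exist and what arguments they take, which is the contrast with hand-wired function calling drawn in the conversation.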

James Kaplan: Those of us who struggled with the syntax of Windows API calls in the 1990s appreciate that MCP allows for less finicky syntax in calling APIs.

Marco Palladino: At least we don’t have agents consuming CORBA APIs. There’s that.

James Kaplan: Some things are best left in the past.

Marco Palladino: Yeah.

James Kaplan: One of the things I find especially interesting is the application of non-deterministic controls. Preventing a social security number or credit card number from being sent somewhere is easy — it’s a pattern. Figuring out whether sensitive pricing data should be allowed to go somewhere is more context-dependent. Deciding how many tokens you want an agent to consume can also be very context-dependent — you may not want a hard cap; you may want certain agents to consume more tokens in certain circumstances and fewer in others. Could we talk a bit about non-deterministic controls and how you think about them in the context of an AI gateway?

Marco Palladino: Non-determinism in agents and AI is both a curse and a blessing.

James Kaplan: Of course.

Marco Palladino: Organizations that are embracing AI and doing it right are also embracing the benefits of non-determinism. Think about it: if we’re using LLMs and AI but not embracing non-determinism, then what we have is just a workflow. Why are we using AI at all? We can build workflows the old-fashioned way. Not appreciating non-determinism in AI is like organizations in the early days adopting containers but deploying them one container per virtual machine. You’re using containers, but not really — you’re losing the whole point.

Using AI requires an appreciation that there is going to be non-determinism in the outcomes it produces, and that is the power of AI. But we also need to make sure the outcomes can’t just be all over the place. While outcomes may not be perfectly deterministic, we need a range of acceptable options.

The best way to think about non-deterministic AI is in terms of risk management. When generating a loan for a customer, there’s a certain degree of risk the organization is willing to tolerate — a range within which we’ll generate loans and beyond which we won’t. Thinking about AI-generated outcomes has to work the same way: determine what risk you as an organization are willing to tolerate for specific outcomes, deny outcomes that fall outside that range, and fully appreciate the non-determinism within the range you’ve defined.

Many organizations are struggling with this, and because of it they’re struggling to generate outcomes with AI. The biggest outcomes any organization will generate are the ones tied to the core business — for a bank, that’s customer financial data, loans, money; for a healthcare organization, it’s claims processing. Because the organization doesn’t feel comfortable putting AI in the core business, it will never generate outcomes that are truly impactful. It’s a chicken-and-egg problem. We need to appreciate AI and invest in the right platforms for managing it — so we feel comfortable enough to put AI in the core business processes that will generate outsized outcomes. Without that investment, the organization will never trust AI in the inner workings of the business, and the CFO will eventually ask: we spent $100 million on AI — did we generate $100 million in outcomes? The answer is always going to be no if AI never found its place in the core business. Only by being in the core business can AI generate that return on investment.
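The risk-management framing above can be sketched as a guardrail that accepts any AI-proposed outcome inside a tolerated band and denies anything outside it. This is a simplified illustration, not a real underwriting rule. The `RiskBand` fields, the thresholds, and the `review_proposal` function are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskBand:
    """Range of outcomes the organization will tolerate (values are illustrative)."""
    min_amount: float
    max_amount: float
    max_default_probability: float


def review_proposal(amount: float, default_probability: float, band: RiskBand) -> str:
    """Accept any proposal inside the band; deny anything outside it."""
    if not (band.min_amount <= amount <= band.max_amount):
        return "deny: amount outside tolerated range"
    if default_probability > band.max_default_probability:
        return "deny: risk above tolerance"
    # Inside the band the model is free to vary: this is where
    # non-determinism is accepted rather than suppressed.
    return "accept"


band = RiskBand(min_amount=1_000, max_amount=50_000, max_default_probability=0.05)
print(review_proposal(20_000, 0.02, band))
print(review_proposal(75_000, 0.02, band))
```

The guardrail does not try to make the model deterministic. It only bounds the set of outcomes the organization will act on.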

James Kaplan: Imagine you have a CIO or CTO who says: I’m convinced we need to bring AI to the core of the business. I’m convinced we need a platform. I’m convinced we need an AI gateway. What are the major design decisions? What are the major architectural choices he or she might face, and what are the reasons that might push you in one direction versus another?

Marco Palladino: They have to think about centralizing governance. We want to decentralize the execution of AI, but governance is very hard to run well in a decentralized way.

James Kaplan: Explain what you mean by centralized governance.

Marco Palladino: With decentralized governance, everyone is on their own — making their own judgment calls about what’s good and what’s not. Centralized governance reduces the risk of adopting AI across the organization, but at the same time you want to give teams a degree of freedom — a bounded degree — to experiment within the governance you’ve established. Within these guardrails, you can move, you can experiment. On one hand, we don’t want to slow down innovation, so we want teams to be able to try new models, new tools, build new agents. At the same time, we want that experimentation to be bounded by centralized governance in such a way that nobody can ever put the organization or customer data at risk.

James Kaplan: What are my red lines? What are the things I can’t compromise on?

Marco Palladino: Customer data is number one. Whatever we do with agents, we cannot put customer data at risk. Organizations will need a strategy to anonymize data going through a model — and to dynamically reinsert that data on the way back, so the model never sees it but end-user experience is unaffected. PII encryption. The organization may also want to determine which models can be consumed and which cannot. We may want developers to use models from trusted vendors, and not allow them to use an untrusted vendor that might learn from all the data and interactions — effectively copying IP and creating organizational risk.

What models are being used? What data is flowing to those models? What MCP tools can agents use? What identity are we giving agents, so we can identify them and determine what they can and can’t do with models, APIs, and MCP tools? There’s a whole agent identity problem: agents are using MCP tools, some of which use third parties — how do we identify the agent, and how do we identify the end user consuming the agent to act on their data? None of this can be reinvented every time a team wants to build an agent. That would be madness — a colossal risk for any organization.

As the industry matures from early experimentation to actually running agents in production, those agents need all of this underlying infrastructure. An AI gateway is a core part of that.

There was a lot of early experimentation in the last two or three years, and now organizations have identified hotspots where agents can help with specific business processes — moving faster, innovating faster. The question now is: how do we enable every team in the organization to become an agentic developer?
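The anonymize-and-reinsert strategy mentioned earlier in this answer can be sketched in a few lines. This is a minimal illustration that only handles values shaped like US social security numbers. The placeholder format, the function names, and the `fake_model` stand-in are assumptions, and a production system would cover many more data types and would not rely on a single regular expression.

```python
from __future__ import annotations

import re

# Matches values shaped like 123-45-6789 (illustrative pattern only).
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def anonymize(text: str) -> tuple[str, dict[str, str]]:
    """Replace each SSN-shaped value with a placeholder; return text and mapping."""
    mapping: dict[str, str] = {}

    def substitute(match: re.Match) -> str:
        value = match.group(0)
        # Reuse the placeholder if the same value appears more than once.
        for placeholder, original in mapping.items():
            if original == value:
                return placeholder
        placeholder = f"<SSN_{len(mapping) + 1}>"
        mapping[placeholder] = value
        return placeholder

    return SSN_PATTERN.sub(substitute, text), mapping


def reinsert(text: str, mapping: dict[str, str]) -> str:
    """Restore the original values in the model's reply."""
    for placeholder, original in mapping.items():
        text = text.replace(placeholder, original)
    return text


def fake_model(prompt: str) -> str:
    """Stand-in for an LLM call; it only ever sees placeholders."""
    return f"Summary: {prompt}"


prompt = "Customer 123-45-6789 applied; co-signer 987-65-4321."
safe_prompt, mapping = anonymize(prompt)
reply = reinsert(fake_model(safe_prompt), mapping)
print(safe_prompt)
print(reply)
```

The mapping never leaves the organization's side of the gateway, so the model sees only placeholders while the end user still receives a reply containing the real values.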

James Kaplan: Any other design decisions? We talked about centralized versus decentralized governance. What else is especially important when implementing an AI gateway?

Marco Palladino: There’s a whole area of making agents effective, which can also be centralized. For example, reducing token consumption and optimizing how we leverage AI, especially for organizations that have already found their use case. Their next problem is: how do we make this cheaper? Things like prompt compression, or semantic caching: the ability to understand the semantic meaning of prompts so you can build what is essentially a semantic CDN that doesn’t require hitting an LLM every time. If the meaning of a question has already been captured and cached by another agent, the answer can be served from that cache instead of the model. Think of it like a CDN, but semantic.

For example, if I ask an LLM “What is the population of New York?” and then separately ask “How many people live in New York?” — I’m using different words but asking the same thing. That could be a cached response. It helps on two dimensions: cost control, which is increasingly sensitive and will become more so as LLMs stop subsidizing every token and start pricing to reflect actual costs — it’s like Uber in the early days, when you paid $3 for a ride that really cost much more. When real costs emerge, CFOs will ask whether we’re generating the right outcomes for the spend. Prompt compression can reduce token consumption while retaining the same semantic meaning: “Please tell me how many people live in New York” compressed to “What’s the population of New York?” — much smaller token count, same meaning.

And then there’s observability: not only measuring what models and MCP tools we’re using and what agents are consuming the most, but also what outcomes we’re generating. Our customers tell us their biggest challenge is understanding outcomes. They can build agents, consume LLMs and MCP tools — but are they actually generating any outcome? How do you quantify the economic impact an agent has generated if you’re not tracking those outcomes?

Even outcome tracking is something that can be centralized — so that when teams build agents, the entire observability stack, from low-level connections all the way up to business intelligence and outcomes, is captured centrally. Teams don’t have to rebuild it every time.
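The semantic caching idea from this answer can be sketched as a cache keyed on meaning rather than exact text. The sketch assumes an embedding function that maps prompts with similar meaning to nearby vectors. Here `toy_embed` is a hand-written lookup table standing in for a real embedding model, and its vectors, the 0.9 threshold, and the `SemanticCache` class are all illustrative assumptions.

```python
from __future__ import annotations

import math
from typing import Callable, Optional, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SemanticCache:
    """Return a cached response when a new prompt is close enough in meaning."""

    def __init__(self, embed: Callable[[str], Vector], threshold: float = 0.9):
        self.embed = embed
        self.threshold = threshold
        self.entries: list[tuple[Vector, str]] = []

    def get(self, prompt: str) -> Optional[str]:
        query = self.embed(prompt)
        best_score, best_response = 0.0, None
        for vector, response in self.entries:
            score = cosine(query, vector)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None

    def put(self, prompt: str, response: str) -> None:
        self.entries.append((self.embed(prompt), response))


# Hand-made vectors standing in for a real embedding model (assumption).
TOY_VECTORS = {
    "What is the population of New York?": (0.90, 0.10, 0.00),
    "How many people live in New York?": (0.88, 0.15, 0.02),
    "What is the weather in New York?": (0.10, 0.20, 0.95),
}


def toy_embed(prompt: str) -> Vector:
    return TOY_VECTORS[prompt]


cache = SemanticCache(embed=toy_embed)
cache.put("What is the population of New York?", "cached answer")
print(cache.get("How many people live in New York?"))
print(cache.get("What is the weather in New York?"))
```

With these toy vectors, the two population questions land close together and share one cached response, while the weather question falls below the threshold and would still go to the model. The threshold is the main tuning knob: set it too low and different questions get the same answer, set it too high and the cache rarely hits.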

James Kaplan: I can see how an AI gateway might help you track usage. How does it help you track outcomes?

Marco Palladino: The AI gateway sits in between the agents and the models and the MCP tools and APIs they’re consuming. So everything the agent does, the AI gateway is aware of — because it is on the execution path of all of it. By doing so, the organization has a centralized control plane. Think of it as a control tower for AI, where you set up all the governance rules, security rules, data governance rules, optimizations, and observability rules you want. Then teams go build agents. Whenever an agent makes a request, it has to go through the AI gateway infrastructure, and all of that governance and observability gets captured centrally.

We could avoid doing this — but then we have a much bigger problem. Whether governance, security, and observability are established through an AI gateway or not, we still need all of it. We have to enable teams to succeed by removing those cross-cutting requirements.

James Kaplan: You can start to interrogate the traffic that goes to the models, and there’s a lot of insight about business value there. And at the same time, you have a lot of insight about cost — which is interesting, because for the first time in a long time, the marginal cost of compute may be relevant relative to business value, rather than being a small fraction of it. CPU has been cheap; GPU is expensive.

Marco Palladino: I would argue CPU will be cheaper only for now. I’m also a big believer in an autonomous agentic world where agents are going to be not only intermediaries but the actual buyers of software. In an agentic economy where agents are automating more and more business transactions, we’re going to hit a bottleneck on CPUs — because we’re effectively replacing humans with CPUs. We just haven’t seen that yet because we’re still building out the use cases. But at scale, when every organization runs with an army of agents handling its core business operations, there will absolutely be a CPU shortage. If agents truly become what I think they will, that day is coming.

James Kaplan: We may see a flipping of technology economics. For the past twenty years or so, infrastructure has been cheap and application development has been expensive. AI makes software engineering cheaper but creates massive compute requirements. Whether it’s more GPU than CPU, one way or another it makes infrastructure costs more relevant — and so the marginal cost of compute will matter in a way it hasn’t in several decades, relative to business value.

Marco Palladino: One use case that illustrates this: today, organizations are experimenting with agentic IDEs — Claude Code, Codex, Cursor. Can they help developers build faster, or build more, by leveraging AI? Today, a human developer is asking prompts in these tools to go build software. How far are we really from having another agent asking the prompts to go build software? Now you’ve removed the entire human component. And that agent will know what to build, or not build, thanks to inputs arriving via APIs or MCP tools. The agent will make the judgment calls a human developer used to make, but autonomously. We’re going to have agents telling agents to go build agents. It’s going to be agents all the way down.

James Kaplan: Anything I neglected to ask about? Anything else we should cover?

Marco Palladino: I think there are two things worth mentioning: a change in distribution, and a change in customer behavior. The distribution one is especially interesting.

Today, businesses invest enormous amounts of money and effort to reach human customers — a billboard on the highway, a TV commercial, a YouTube ad. The internet runs on those commercials. But when the agent becomes the buyer, all of a sudden the agent doesn’t care about the billboard. It doesn’t care about the YouTube video. How does an agent decide what product to use? It looks at documentation. It looks at examples, getting-started guides, what other agents have been doing.

There’s going to be a whole new distribution channel — potentially larger than today’s human distribution channel — where to attract customers, you’re not attracting humans anymore, you’re attracting agents. Everything is going to change when that happens. Every business that relies on digital advertising will find that channel works differently than it does today.

That is going to change the internet as we know it. I think it’s extremely exciting. What’s even more exciting is that those of us in this conversation are not only witnessing it — we have the opportunity to help build it. It’s a builder’s era.

James Kaplan: The distribution point is fascinating. It makes commercial markets more like equity trading, where there’s been no human in the loop for years. What you’re suggesting is that many commercial markets may become agent-based — algorithms, instantiated as agents, transacting with other algorithms instantiated as agents.

Marco Palladino: Exactly. They’re not transacting equities or securities the way a financial trading bot would, but they’re transacting outcomes. They can bid on taking an outcome and completing it. A whole new economy is going to be born from that.

And it sounds futuristic, but it isn’t. It has happened before. Thirty years ago, if you weren’t in the Yellow Pages, your business didn’t exist. Then the internet was born — all of a sudden you needed a .com website, or your business didn’t exist. Then the iPhone: customers moved from websites to mobile apps, and if you didn’t have a mobile app, your business didn’t exist. Well, the customer is moving again.

James Kaplan: What you’re saying applies to tokens as well. You may see secondary markets for tokens, with agents bidding against each other for them.

Marco Palladino: Anything that uses a limited resource will eventually find a way to create a secondary market for it — whether it’s a token or a thirty-year-old Mercedes-Benz, there’s always going to be a market.

James Kaplan: I’m still waiting for someone to set up the first trading desk for tokens.

Marco Palladino: It’s going to happen. And it’s quite exciting. Some people look at this and say AI is going to damage society. I think society will have to evolve, and there will be a transition period. I’m just making an observation — I don’t know exactly what’s going to happen. But with the last industrial revolution, things are much better now than they were then. There was a century of societal upheaval — capitalism versus communism, all kinds of political and economic realignment — because of that revolution. This is a new industrial revolution. Are we going to be better off a hundred years from now, when the hard work is done for us? I believe so. Is the transition going to be easy? I don’t know. But it’s progress, and progress encompasses moments of realignment in how we look at technology and how we adapt to it.

James Kaplan: Terrific. Thank you so much. This was great.

Marco Palladino: Thanks for the opportunity. I had a blast.
