VIDEO

Why Infrastructure Will Catalyze AI with Meta, Lambda, & Silicon Data

WEKA CMO Lauren Vaccarello hosts women AI leaders at AI Infra Summit for a panel on the future of AI infrastructure.

TL;DR At the AI Infra Summit 2025 Women in AI breakfast panel, WEKA CMO Lauren Vaccarello led a thought-provoking discussion among industry leaders from Meta, Lambda, and Silicon Data about critical AI infrastructure challenges. The conversation covered themes everyone building AI infrastructure for the future should be thinking about, including GPU capacity planning strategies, balancing long-term infrastructure investment with operational agility, and cost optimization frameworks. Panelists emphasized that while AI presents unique challenges, fundamental infrastructure principles remain constant. Their advice? Start with “first principles” thinking, measure comprehensively across metrics, maintain passion for continuous learning, and build collaborative teams to navigate this rapidly evolving landscape.

Below is an edited transcription of the talk. Enjoy!

00:00

Women Leaders in AI Infrastructure: Panel Introduction

Lauren Vaccarello (Chief Marketing Officer at WEKA) kicked off the conversation and introduced the panel of distinguished industry leaders:

  • Rebecca “Bink” Naughton – Leading data center capacity strategy at Lambda, with three decades of experience across Google, Meta, Yahoo, and Microsoft, having built infrastructure for everything from Meta’s first AI research cluster to multi-billion dollar supercomputers
  • Carmen Li – Founder and CEO of Silicon Data and CEO of Compute Exchange, transforming global compute markets through data transparency and creating an independent marketplace for GPU compute trading
  • Elisa Chen – Data scientist at Meta with over five years in AI infrastructure, building the foundation that powers machine learning models serving ads to hundreds of millions of users daily

03:03

AI Infrastructure Challenges: Hardware vs. Software Innovation Speed

The Challenge of Slow Infrastructure vs. Fast Innovation

Speaker: Elisa Chen

Elisa addresses how AI infrastructure often moves slower than technology innovation, outlining several key strategies:

Supply Levers Beyond Incremental Orders:

  • Leverage elastic supply through GPUs as a service
  • Build in-house elastic resources
  • Utilize swap contracts for flexibility
  • Implement dynamic quota allocation based on flexible forecasting and telemetry

Multi-Tiered Planning Approach:

  • Short-term planning: Direct engagement with AI researchers to understand immediate project needs and experimental results
  • Long-term planning: Accounting for organizational big bets and 5-10 year strategic plays based on market trends, emerging hardware, and frontier research
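
The dynamic quota allocation idea above can be sketched in code. This is a hypothetical illustration of blending demand forecasts with observed telemetry to split a fixed GPU budget, not how Meta actually implements it; all team names and numbers are made up.

```python
# Hypothetical sketch of dynamic quota allocation: GPU quota is split across
# teams in proportion to a blend of forecast demand and recently observed
# (telemetry) usage, scaled to total cluster capacity. Names are illustrative.

def allocate_quota(total_gpus, forecasts, observed_usage, blend=0.5):
    """Blend forecasted demand with telemetry, then scale to capacity."""
    scores = {
        team: blend * forecasts[team] + (1 - blend) * observed_usage.get(team, 0)
        for team in forecasts
    }
    total_score = sum(scores.values()) or 1
    return {team: round(total_gpus * s / total_score) for team, s in scores.items()}

quotas = allocate_quota(
    total_gpus=1024,
    forecasts={"research": 600, "ads_ranking": 300, "experiments": 100},
    observed_usage={"research": 500, "ads_ranking": 400, "experiments": 50},
)
```

Blending forecasts with telemetry keeps quotas responsive: a team that consistently under-uses its forecast sees its allocation drift down without a manual replanning cycle.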

Physical Infrastructure Strategy and Modularity

Speaker: Rebecca “Bink” Naughton

Bink emphasizes the evolving challenge: “You can’t build fast enough to keep up with every single iteration.” Key insights include:

  • Developing modular approaches for right capacity at right time
  • Recognizing that infrastructure will always lag behind hardware, which lags behind applications
  • Understanding that not everyone needs bleeding edge technology all the time
  • Monitoring for headroom opportunities and retuning workloads accordingly

08:29

GPU Capacity Planning and Compute Market Dynamics

Market Dynamics and Flexibility

Speaker: Carmen Li

Carmen highlights the rapid pace of change, noting that new chip releases every six months trigger cascading changes throughout the entire technology stack. She emphasizes:

  • The critical importance of flexibility in infrastructure planning
  • Various contract options: five-year commitments vs. month-to-month vs. on-demand
  • The challenge for startups needing significant compute but unable to secure month-to-month B200 access

Planning in an Uncertain Environment

Speakers: Lauren Vaccarello & Elisa Chen

The panel discusses the tension between needing long-term planning (3-5 years) while maintaining agility for unpredictable near-term needs. Elisa notes that perfect utilization (95%) is “a dream state you will probably never reach,” but idle resources can serve additional use cases:

  • Handling operational spikes
  • Supporting below-the-line projects
  • Running new experiments
  • Building elastic pools for ad hoc workloads
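
The idle-capacity point can be made concrete with a quick sketch: whatever sits between committed capacity and steady-state demand becomes the elastic pool for spikes, experiments, and ad hoc work. The figures below are illustrative, not from the panel.

```python
# Illustrative sketch: the gap between usable capacity and measured steady-state
# demand becomes an "elastic pool" for spikes, experiments, and ad hoc jobs.

def elastic_pool(total_gpus, steady_state_demand, target_utilization=0.95):
    """Return GPUs available for opportunistic work after reserving headroom."""
    # "Perfect" 95% utilization is a ceiling, not a plan, per the discussion above.
    usable = int(total_gpus * target_utilization)
    return max(usable - steady_state_demand, 0)

spare = elastic_pool(total_gpus=1000, steady_state_demand=800)  # 150 GPUs spare
```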

Understanding Peak Load vs. Actual Performance Needs

Speaker: Rebecca “Bink” Naughton

Bink emphasizes the importance of distinguishing between theoretical peak requirements and practical needs: “There’s peak load, and then there’s how fast you need to deliver… And then there’s okay, what can I actually live with.”

Key considerations:

  • Applications and algorithms are continuously evolving
  • Hardware optimizations drive ongoing efficiency improvements
  • Organizations must investigate and interrogate their workloads
  • Understanding how fast computation truly needs to run provides crucial leverage

The Two-Tier GPU Market: Enterprise Underutilization vs. Startup Scarcity

Speaker: Carmen Li

Carmen identifies two pools of users in the GPU compute market:

Enterprise Challenge – Underutilization:

  • Large Fortune 500 companies experiencing ~60% GPU utilization
  • Exploring monetization of idle GPU clusters
  • Interest in multi-tenant solutions to maximize asset value
  • Facing “the world’s hardest problem” in optimization

Startup Challenge – Supply Shortage:

  • Cannot source compute fast enough
  • Unable to secure month-to-month B200 rentals
  • Significant imbalance in demand-supply curves

Strategic Recommendations:

  • Reserve contracts for known, critical workloads
  • Ensure flexibility with on-demand and multi-cloud options
  • Verify node reliability and spin-up speed
  • Implement transparency before workload transfer

Hardware Quality Verification: A Critical Infrastructure Challenge

Speaker: Rebecca “Bink” Naughton

Bink highlights Lambda’s focus on delivering functional AI hardware, noting it’s “one of the more difficult things to land with quality.” Key insights:

  • Hardware quality significantly impacts the timeline from facility readiness to equipment utilization
  • Ability to mature hardware quickly is “a superpower”
  • Industry-wide challenge requiring continuous attention

Third-Party Verification and Transparency in GPU Markets

Speaker: Carmen Li

Carmen addresses the transparency challenge facing smaller companies and introduces Silicon Data’s verification approach:

The Transparency Problem:

  • Companies self-report cluster performance without third-party verification
  • Donating clusters for month-long testing is costly (lost revenue opportunity)
  • Lack of objective performance data

Machine-Level Verification:

  • On-premises verification using containerized testing
  • Comprehensive performance metrics and UID tracking
  • Performance decay curve analysis
  • Enables bank financing for GPU infrastructure (e.g., refinancing 100-node clusters)
  • Provides objective data for financial and operational decisions

This verification approach helps establish trust and transparency in the GPU compute marketplace.
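
As a rough illustration of what "performance decay curve analysis" can mean in practice (this is a generic sketch, not Silicon Data's actual methodology), one can benchmark each node's throughput over time and flag nodes that have drifted below their baseline. Thresholds and data are made up.

```python
# Hypothetical decay-curve check: flag nodes whose latest benchmarked throughput
# has fallen more than `tolerance` below their first (baseline) measurement.

def decayed_nodes(history, tolerance=0.10):
    """history: {node_id: [throughput samples, oldest -> newest]}."""
    flagged = []
    for node, samples in history.items():
        baseline, latest = samples[0], samples[-1]
        if latest < baseline * (1 - tolerance):
            flagged.append(node)
    return flagged

flagged = decayed_nodes({
    "node-01": [100.0, 99.0, 98.5],   # within 10% tolerance
    "node-02": [100.0, 95.0, 85.0],   # decayed beyond 10%
})
```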

16:20

AI Model Performance Requirements and Hardware Optimization

User-Centric Performance Needs

Speakers: Lauren Vaccarello & Rebecca “Bink” Naughton

The panel explores how different use cases require vastly different performance levels. Lauren’s example: finding a location in Union Square needs sub-second response, while redesigning a dining room can tolerate longer processing times.

Bink emphasizes a dialog-based approach rather than dictating performance requirements, staying adjacent to customers in major markets to understand and meet their needs.

Model-to-Hardware Mapping

Speaker: Elisa Chen

Elisa suggests user behavior pattern analysis – e.g., who is more or less likely to click on an ad – and provides detailed guidance on matching models to appropriate hardware:

  • Foundation model training: Requires dense GPU clusters
  • Fine-tuning and distillation: Can utilize mid-tier options like A100s
  • Inference: Offers significant flexibility with “a lot of bang for your buck” using less powerful hardware

The key is understanding the trade-off between latency, model performance, and cost.
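
The mapping Elisa describes could be sketched as a simple lookup, with the latency budget as a second input. The tiers and categories here are illustrative assumptions, not a vendor recommendation or the panel's exact taxonomy.

```python
# Rough sketch of model-to-hardware mapping: workload type plus an optional
# latency budget selects a hardware tier. All tier names are illustrative.
from typing import Optional

HARDWARE_TIERS = {
    "foundation_training": "dense top-tier GPU cluster",
    "fine_tuning": "mid-tier accelerators such as A100-class GPUs",
    "distillation": "mid-tier accelerators such as A100-class GPUs",
    "inference": "less powerful, lower-cost hardware chosen per latency budget",
}

def pick_hardware(workload: str, latency_budget_ms: Optional[float] = None) -> str:
    """Map a workload (plus an optional latency budget) to a hardware tier."""
    # Tight latency budgets push inference back onto faster, costlier hardware.
    if workload == "inference" and latency_budget_ms is not None and latency_budget_ms < 100:
        return "low-latency serving tier (faster, costlier hardware)"
    return HARDWARE_TIERS.get(workload, "profile the workload before choosing")
```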

The Cost-Quality-Latency Triangle

Speaker: Carmen Li

Carmen introduces another dimension: timing flexibility. Batch processing at off-peak times can dramatically reduce costs for workloads that don’t require immediate results.
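
A back-of-envelope calculation shows why timing flexibility matters for deferrable jobs. The hourly rate and off-peak discount below are assumed purely for illustration.

```python
# Illustrative cost comparison for the same batch job run at peak vs. off-peak.

def batch_cost(gpu_hours: float, peak_rate: float, off_peak_discount: float):
    """Return (peak cost, off-peak cost) in dollars for the same job."""
    peak = gpu_hours * peak_rate
    off_peak = peak * (1 - off_peak_discount)
    return peak, off_peak

# 10,000 GPU-hours at an assumed $2.50/hr, with an assumed 40% off-peak discount:
peak, off_peak = batch_cost(gpu_hours=10_000, peak_rate=2.50, off_peak_discount=0.40)
```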

20:25

AI Infrastructure Cost Management and ROI Measurement

Understanding True Costs

Speakers: Lauren Vaccarello, Carmen Li, Elisa Chen & Rebecca “Bink” Naughton

An audience question reveals many attendees don’t know their actual AI infrastructure spending. The panel discusses the complexity of measuring costs: is it GPUs, tokens, or number of queries?

Hidden and Fixed Costs (Carmen & Elisa):

  • Token costs (input and output)
  • Human engineering costs (often the most expensive)
  • Power and cooling for data centers
  • Infrastructure build-out
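
A simple tally of the components above shows why token prices alone understate true cost. All figures are placeholders, not numbers from the panel.

```python
# Illustrative total-cost sketch covering the components discussed: token spend,
# human engineering time, and power/cooling. Every figure here is a placeholder.

def monthly_ai_cost(input_tokens_m, output_tokens_m, price_in_per_m, price_out_per_m,
                    engineer_months, loaded_cost_per_engineer, power_cooling):
    token_cost = input_tokens_m * price_in_per_m + output_tokens_m * price_out_per_m
    people_cost = engineer_months * loaded_cost_per_engineer  # often the largest line item
    return {
        "tokens": token_cost,
        "people": people_cost,
        "power_cooling": power_cooling,
        "total": token_cost + people_cost + power_cooling,
    }

costs = monthly_ai_cost(
    input_tokens_m=500, output_tokens_m=100,      # millions of tokens (assumed)
    price_in_per_m=3.0, price_out_per_m=15.0,     # $/million tokens (assumed)
    engineer_months=4, loaded_cost_per_engineer=25_000,
    power_cooling=8_000,
)
```

Under these assumed figures, the people cost dwarfs the token bill, echoing the panel's point that engineering time is often the most expensive line item.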

Defining ROI in AI

Speakers: Elisa Chen & Carmen Li

The panel acknowledges significant challenges in measuring AI infrastructure ROI:

  • Difficulty in determining what constitutes good ROI
  • Lack of clear understanding of capacity upside
  • Murky cost chains throughout the industry
  • Complexity in choosing appropriate metrics

Carmen cautions that any single metric has blind spots and recommends measuring comprehensively and triangulating across multiple data points.

The Opportunity Cost Framework

Speaker: Rebecca “Bink” Naughton

Bink frames cost as opportunity cost: “When you have a clear value proposition, it becomes much easier to justify what you need and where.” The focus should be on:

  • Understanding what AI will accomplish
  • What the output is worth
  • What you’re prepared to spend
  • Picking the right infrastructure at the right time, in the right place

27:28

Lessons from Tech History Applied to AI Infrastructure: What’s New and Which Patterns Are Repeating

Pattern Recognition Across Decades

Speakers: Rebecca “Bink” Naughton & Elisa Chen

Bink notes: “History doesn’t repeat itself. It rhymes.” The fundamental pattern of developing applications and improving efficiency remains consistent across 30 years of infrastructure evolution.

However, Elisa highlights unique AI challenges:

  • Lack of robust evaluation frameworks
  • Non-deterministic outputs (same input can produce different outputs)
  • Difficulty determining if models are truly performing as expected

Borrowing from Traditional Markets

Speaker: Carmen Li

Carmen describes Silicon Data’s approach of applying commodity market infrastructure to compute:

Traditional Markets Model:

  • Oil traders use transparent, accessible futures markets
  • No need for direct OPEC contracts
  • Equal access for all participants

Compute Market Reality:

  • Opaque pricing and access
  • Long-term contracts with hyperscalers
  • Penalties for backing out
  • High barriers to entry

Silicon Data is working to normalize GPU configurations and create indexes (published on Bloomberg and Refinitiv) to enable futures and options trading on compute resources.

31:48

Caching Strategies and Memory Optimization for AI Workloads

The Caching Renaissance

Speakers: Lauren Vaccarello & Carmen Li

Lauren draws parallels to early internet days when caching revolutionized web performance. Carmen enthusiastically agrees, noting many companies don’t realize they’re doing unnecessary recomputation.
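
The caching point in miniature: memoizing identical queries means repeated requests never trigger recomputation. The "model call" below is simulated; in a real system you would key on a normalized prompt, and this is only an assumed sketch of the pattern.

```python
# Minimal sketch of avoiding unnecessary recomputation via memoization.
from functools import lru_cache

calls = 0  # counts actual (non-cached) computations

@lru_cache(maxsize=10_000)
def answer(query: str) -> str:
    global calls
    calls += 1
    return f"result for {query!r}"  # stand-in for an expensive inference call

answer("coffee near Union Square")
answer("coffee near Union Square")  # identical query: served from cache
```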

The Memory Paradox

Speakers: Rebecca “Bink” Naughton, Elisa Chen, & Lauren Vaccarello

Elisa identifies a counterbalancing force: while memory usage is becoming more efficient, long context models for deep research and complex thinking are simultaneously demanding more memory. This creates an ongoing tension between optimization and capability expansion.

The panel agrees infrastructure must accommodate both:

  • Simple scenarios (cacheable queries like location lookups)
  • Complex scenarios (multi-turn conversations requiring extensive context)

Bink notes this challenge has remained constant across different computing eras.

36:16

Building Competitive Advantage Through AI Infrastructure

Better, Faster, Cheaper, Consistent, and Predictable

Speaker: Rebecca “Bink” Naughton

Bink notes the industry is still in a trial-and-error phase, with a great deal of experimentation under way.

Transparency as Foundation

Speaker: Carmen Li

With no one having definitive answers in this nascent industry, Carmen emphasizes transparency as key:

  • Mindful tracking of metrics and workflows
  • Honest assessment of system utilization
  • Understanding current position and weaknesses
  • Creating pathways for improvement

Chasing the Bottleneck

Speakers: Lauren Vaccarello, Carmen Li, Rebecca “Bink” Naughton, & Elisa Chen

Bink describes performance optimization as perpetual bottleneck hunting: “What is true one day or even one hour [can change]. You make a tweak and adjustment… and it may be something else.”

The panel’s recommendations (Elisa):

  • Establish robust data foundations
  • Implement comprehensive telemetry
  • Hire the right data analysts
  • Build out the funnel and lifecycle views to identify changing bottlenecks

The Global Supply Chain Game

Speaker: Carmen Li

Carmen notes that bottlenecks vary by geography and time:

  • Energy constraints in America that other countries don’t face
  • Network and switch limitations in those other countries instead
  • Packaging bottlenecks in specific regions
  • Cascading impacts throughout supply chains

She frames it as “a giant game” requiring constant monitoring and adaptation.

40:56

Closing Thoughts: Expert Advice for Building Scalable AI Infrastructure for the Future

Start With First Principles

Speakers: Rebecca “Bink” Naughton & Lauren Vaccarello

Bink: “Experimentation here is key. Experimentation at a reasonable value.” She emphasizes:

  • Understanding what you need from the activity
  • Having a clear thesis about desired outcomes
  • Testing at smaller scales for viable inputs
  • Iterating based on results

Lauren: Advocates for “first principles thinking”:

  • Define what you’re trying to accomplish
  • Determine what “good” looks like
  • Run small tests
  • Rinse and repeat
  • Scale up gradually

Passion and Continuous Learning

Speaker: Carmen Li

Carmen: “Be obsessed. Love what you do.” Key insights:

  • Passion drives continuous learning
  • The industry rewards those who stay current
  • Outdated skills become obsolete quickly (“outdated in month six”)
  • Continuous learning creates sustainable competitive advantage
  • The industry “weeds out people very quickly” who don’t adapt

Building the Right Team and Collaboration Models

Speakers: Elisa Chen & Lauren Vaccarello

Elisa: Emphasizes hiring curious, passionate people while also establishing effective collaboration with:

  • AI researchers and data scientists
  • Technical program managers
  • Business capacity planners
  • Vendors and partners
  • Creating smooth operations where “we don’t want the bottleneck to be people”

Lauren: Reinforces that success isn’t about being the smartest person in the room alone:

  • Build teams across different functions
  • Network with peers in the industry
  • Bounce ideas off the community
  • Leverage collective knowledge and experience

Lauren concludes the panel by emphasizing that “this is the first minute of the first day—we are building the future of AI together.” The panel’s collective wisdom points to leveraging community, maintaining curiosity, and working collaboratively to push the boundaries of what’s possible in AI infrastructure.

Like This Discussion? There’s More!

Lauren Vaccarello has been all over the world the last few months talking about the future of AI infrastructure. If you want to learn more about token economics and the dramatic impact it will have on AI builders, check out Lauren’s keynote from World Summit AI in San Francisco: