Private AI Infrastructure New Zealand

What private AI infrastructure is, how it works, and what NZ organisations need to deploy sovereign AI inside their own environment.

Executive Summary

CEO / Board

Private AI infrastructure means your AI runs on hardware you control — not on servers owned by Google, Microsoft, or OpenAI. This is the only way to guarantee that community or cultural data never leaves your environment.

CIO / CTO

Modern private AI uses open-source models (Llama, Mistral, Phi) with retrieval-augmented generation (RAG) — no need to train a model from scratch. A well-configured private deployment rivals cloud AI for organisation-specific knowledge tasks.

Finance / Procurement

Private AI has higher upfront infrastructure cost than cloud subscriptions. However, it eliminates per-seat licensing costs at scale and the compliance/risk overhead of cloud AI. Total cost of ownership over 3–5 years is often comparable.

Operations

Private AI can run without an internal technical team. Sovata provides fully managed operations — monitoring, updates, security, and support — so your team uses the AI system without maintaining it.

What is private AI infrastructure?

Private AI infrastructure is the computing environment in which an organisation's AI systems run — where the hardware, software, and data are owned or exclusively controlled by the organisation, rather than shared with other customers on a public cloud platform.

When you use Microsoft Copilot, ChatGPT Enterprise, or Google Gemini, your queries and documents are processed on infrastructure owned by those companies. Private AI inverts this: the AI runs inside your environment, on infrastructure you control. Your data never leaves.

For New Zealand organisations with data sovereignty requirements, private AI infrastructure is not a preference — it is a prerequisite. It is the only architectural approach that makes data sovereignty a technical guarantee rather than a contractual promise.

The three deployment models

On-premises

Maximum control

AI systems run on hardware physically located within your organisation's premises or a private data centre you operate. This provides complete physical and logical control over the infrastructure but requires your organisation (or a managed service provider) to handle infrastructure operations.

Advantages

Complete physical control
No dependency on external providers
Lowest ongoing operational cost at scale
Easiest to demonstrate to regulators and auditors

Considerations

Higher upfront hardware investment
Requires physical space, power, and cooling
Internal (or managed) maintenance responsibility
Hardware refresh cycle every 3–5 years

Private NZ cloud

Recommended for most

AI runs on dedicated infrastructure in a New Zealand data centre, operated by a NZ hosting provider but allocated exclusively to your organisation. You get NZ data residency and managed infrastructure without the capital cost of your own hardware.

Advantages

NZ data residency with managed operations
Lower upfront cost vs on-premises hardware
Scalable without hardware refresh cycles
Physical security handled by NZ data centre

Considerations

Higher monthly cost than on-premises at scale
Dependency on NZ cloud provider availability
Less physical control than on-premises

Hybrid

For complex requirements

Some AI components run on-premises (typically the most sensitive data processing), while others run in a private NZ cloud (typically less sensitive processing or overflow capacity). This provides fine-grained control at the cost of additional complexity.

Advantages

Tailored sensitivity controls
Flexibility as requirements evolve
Can leverage existing on-premises investment

Considerations

Higher design and operational complexity
Requires careful network architecture
More components to monitor and maintain

How private AI works: retrieval-augmented generation

Most private AI deployments for community organisations use a technique called retrieval-augmented generation (RAG). Understanding this technique is essential to understanding what private AI can and cannot do.

A general-purpose AI model (like those behind ChatGPT) is trained on vast amounts of internet data. It knows a great deal about the world in general, but nothing about your organisation's specific policies, documents, or operational context.

RAG solves this by adding a search step before the AI generates a response: when a user asks a question, the system first searches your organisation's document collection to find relevant passages, then provides those passages to the AI model as context, and the AI generates a response grounded in your actual content.

The result: an AI that answers questions specifically about your organisation's policies, procedures, and knowledge — accurately, with sources — while remaining entirely within your infrastructure.
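The search-then-generate flow described above can be sketched in a few lines of Python. This is a minimal illustration, not Sovata's implementation: the document names and contents are hypothetical, and a simple word-count vector stands in for a real embedding model and vector database. The final prompt is what would be sent to a locally hosted language model.

```python
import math
import re
from collections import Counter

# Hypothetical knowledge base. In a real deployment these would be chunks of
# your own documents, stored as embeddings in a vector database.
DOCUMENTS = {
    "leave-policy.txt": "Staff are entitled to four weeks of annual leave per year.",
    "privacy-policy.txt": "Member records are stored on servers located in New Zealand.",
    "expenses.txt": "Expense claims must be submitted within thirty days of purchase.",
}


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, k: int = 1) -> list[tuple[str, str]]:
    """Step 1: search the document collection for the k most relevant passages."""
    query_vector = embed(question)
    ranked = sorted(
        DOCUMENTS.items(),
        key=lambda item: cosine(query_vector, embed(item[1])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, passages: list[tuple[str, str]]) -> str:
    """Step 2: give the retrieved passages to the model as context."""
    context = "\n".join(f"[{name}] {text}" for name, text in passages)
    return (
        "Answer using only the context below and cite the source in brackets.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


question = "How many weeks of annual leave do staff get?"
passages = retrieve(question)
prompt = build_prompt(question, passages)
print(prompt)  # Step 3 would pass this prompt to the locally hosted model.
```

Because the source file name travels with each passage, the model can cite where an answer came from, which is what makes the "with sources" behaviour possible.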

What infrastructure do you need?

Sovata's Discovery Call includes a technical assessment of your current environment and a recommendation for the most appropriate private AI infrastructure model for your organisation's needs and budget.

Book a Discovery Call

What components make up a private AI system?

Compute layer

GPU or CPU hardware that runs AI model inference. For most knowledge assistant deployments, one to four GPU nodes are sufficient.

Storage layer

Where your documents, knowledge base, embeddings database, and audit logs are stored. Must be within your controlled infrastructure.

AI model

The language model that generates responses. Open-source models (Llama, Mistral, Phi, Qwen) provide strong performance without licensing fees.

Embedding model

A smaller model that converts documents and queries into numerical representations for semantic search. Typically runs efficiently on CPU.

Vector database

Stores document embeddings for fast semantic search. Examples include Qdrant, Weaviate, or pgvector. Runs within your infrastructure.

Orchestration layer

Manages request routing, model selection, tool use, and system integrations. This is where governance controls and guardrails are applied.

Guardrails & safety

Input validation, PII detection, content filtering, and output validation components that ensure the AI operates within approved parameters.

Governance & monitoring

Audit logging, usage monitoring, performance dashboards, and alert systems that maintain visibility and accountability.
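To make the guardrail and governance components more concrete, here is a rough Python sketch of an orchestration step that redacts obvious personal information from a query before it reaches the model, then writes an audit record. The regular expressions, the `handle_request` function, and the log format are all illustrative assumptions; production PII detection needs far broader coverage than two patterns.

```python
import json
import re
from datetime import datetime, timezone

# Illustrative patterns only: real PII detection covers many more cases.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\b0\d{1,2}[\s-]?\d{3}[\s-]?\d{3,4}\b")


def redact_pii(text: str) -> tuple[str, list[str]]:
    """Replace matches with a placeholder and report which kinds were found."""
    findings = []
    for label, pattern in (("email", EMAIL), ("phone", PHONE)):
        text, count = pattern.subn(f"[{label} removed]", text)
        if count:
            findings.append(label)
    return text, findings


def handle_request(user: str, query: str, generate, audit_log: list[str]) -> str:
    """Hypothetical orchestration step: guardrail, then model, then audit log."""
    safe_query, findings = redact_pii(query)
    answer = generate(safe_query)  # generate() stands in for the local model call
    audit_log.append(json.dumps({
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "query": safe_query,  # only the redacted query is logged
        "pii_found": findings,
    }))
    return answer


log: list[str] = []
reply = handle_request(
    "staff-042",
    "Email mere@example.org or call 021 555 1234 about the hui",
    generate=lambda q: f"(model response to: {q})",
    audit_log=log,
)
print(reply)
print(log[0])
```

Logging the redacted query rather than the original keeps the audit trail itself from becoming a store of personal information.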

Cost considerations for NZ organisations

Private AI infrastructure costs should be evaluated in full context, not just compared to a cloud subscription headline price.

Cost category | Cloud AI | Private AI (managed)
Upfront cost | Low (subscription) | Medium–High (infrastructure)
Ongoing licence | Per-seat (scales with users) | None (open-source models)
Managed operations | Self-managed | Included (Sovata Managed Support)
Compliance overhead | High (PIA, legal review ongoing) | Low (architectural guarantee)
Data breach risk cost | Significant | Minimal
3-year total (est.) | Comparable or higher at scale | Comparable or lower at scale

Frequently asked questions

What is private AI infrastructure?

Private AI infrastructure refers to the computing environment in which an organisation's AI systems run — where the hardware, software, and data are owned or exclusively controlled by the organisation itself, rather than shared with other customers on a public cloud. This includes on-premises servers, dedicated private cloud instances, and NZ-hosted managed private cloud environments.

What hardware is needed to run AI privately?

The hardware requirements depend on the AI models used and the scale of deployment. For knowledge assistant applications using retrieval-augmented generation (RAG) with smaller open-source models, a modest GPU server (e.g. a single NVIDIA RTX 4090-class workstation card or an A100-class data centre card) can handle dozens of concurrent users. For larger deployments or more powerful models, multiple GPU nodes or a GPU cluster may be needed. CPU-only inference is possible for some smaller models with lower performance requirements.

Can we run AI on our existing servers?

Possibly, depending on your existing infrastructure. Most modern AI deployments require at least one GPU for acceptable performance, though some lighter models can run on CPU-only hardware with acceptable latency for lower-traffic use cases. A technical assessment of your existing infrastructure is the best way to determine what can be reused and what needs to be added.

What are the main deployment models for private AI in NZ?

The three main deployment models are: (1) On-premises — AI runs on hardware physically located in your office or data centre, providing maximum control but requiring internal infrastructure management; (2) Private NZ cloud — AI runs on dedicated infrastructure in a New Zealand data centre managed by a NZ hosting provider, providing NZ data residency with managed infrastructure; (3) Hybrid — some components on-premises, others in a private NZ cloud, balancing control and operational convenience.

Are there NZ-based AI hosting options?

Yes. Several New Zealand data centre and cloud providers can host private AI infrastructure, including Datacom, 2degrees Business, Spark, and specialist hosting providers with NZ data centre presence. Sovata works with clients to select NZ hosting options appropriate to their sovereignty, performance, and budget requirements.

How much does private AI infrastructure cost?

Costs vary significantly by scale and deployment model. A basic on-premises setup for a small organisation can start from $15,000–$40,000 NZD in hardware, plus setup costs. A private NZ cloud deployment can cost $2,000–$8,000 NZD per month depending on compute requirements. These costs need to be weighed against the total cost of cloud AI (per-seat fees, data exposure risk, compliance overhead) and the savings from not hiring an internal AI team.
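As a rough worked example, the ranges above can be turned into three-year figures. The sketch below uses only the hardware and monthly ranges quoted in this answer; the per-seat cloud price of NZ$50 per month is a purely hypothetical placeholder, and setup costs, managed support fees, power, and compliance overhead are left out because no figures are given for them.

```python
import math


def total_cost(upfront: int, monthly: int, months: int = 36) -> int:
    """Upfront cost plus recurring cost over the period (default three years)."""
    return upfront + monthly * months


def break_even_seats(private_monthly: int, seat_price_monthly: int) -> int:
    """Seats at which per-seat cloud fees reach the private monthly cost."""
    return math.ceil(private_monthly / seat_price_monthly)


# Ranges quoted in the text (NZD).
ONPREM_HARDWARE = (15_000, 40_000)      # hardware only, setup costs excluded
PRIVATE_CLOUD_MONTHLY = (2_000, 8_000)

# Hypothetical per-seat cloud subscription price (NZD per month); an assumption.
ASSUMED_SEAT_PRICE = 50

low, high = (total_cost(0, monthly) for monthly in PRIVATE_CLOUD_MONTHLY)
print(f"Private NZ cloud, 3 years: ${low:,} to ${high:,}")

for monthly in PRIVATE_CLOUD_MONTHLY:
    seats = break_even_seats(monthly, ASSUMED_SEAT_PRICE)
    print(f"${monthly:,}/month matches per-seat fees at about {seats} seats")
```

Under these assumptions, a private NZ cloud deployment costs between $72,000 and $288,000 over three years, and per-seat fees reach the same monthly spend at somewhere between 40 and 160 users. Substituting your own seat price and user count shifts the break-even point accordingly.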

Do we need our own technical team to run private AI?

Not with Sovata. Our Managed Support service means we handle all technical operations — infrastructure monitoring, model updates, security patches, performance optimisation, and incident response. Your organisation receives a working AI system without needing internal AI infrastructure expertise.

What AI models can run on private infrastructure?

Many high-quality AI models can be deployed privately, including open-source models like Llama, Mistral, Phi, and Qwen. For knowledge assistant applications using retrieval-augmented generation (RAG), these models perform well on organisation-specific content. For applications requiring the most advanced general capabilities, private deployments can also connect to external API-based models with explicit, auditable authorisation for specific use cases.

How does private AI performance compare to cloud AI?

For organisation-specific knowledge assistant applications (the primary use case for most NZ community organisations), private AI can match or exceed cloud AI performance because its answers are grounded in your own content. For general-purpose tasks requiring broad world knowledge, the largest cloud AI models (GPT-4o, Claude 3.5 Sonnet) maintain an edge. The key insight: for answering questions about your organisation's policies, processes, and knowledge base, a well-configured private AI will outperform a generic cloud AI that knows nothing about your organisation.

How do we keep private AI models up to date?

Model updates are a key component of managed AI operations. Sovata's Managed Support service handles model evaluation, testing, and updates — assessing new model releases for suitability, testing them against your knowledge base and use cases, and deploying updates with appropriate governance controls. The knowledge base (your documents and policies) is updated separately, typically through an admin interface that authorised staff can use without technical expertise.

Can private AI integrate with Microsoft 365 or SharePoint?

Yes. Private AI systems can be integrated with Microsoft 365 and SharePoint through controlled API connections, allowing the AI to index and search documents from those systems. Importantly, the integration is configurable — you can specify exactly which SharePoint sites, document libraries, or M365 content the AI can access, and the data still flows through your controlled infrastructure rather than Microsoft's AI systems.
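One simple way to picture the "configurable" part is an allow-list that is checked before any document is indexed. The sketch below is an assumption about how such a filter might look, not a description of a Microsoft or Sovata API: the site URLs use the fictional `contoso` tenant, and the document list would in practice come from whatever connector retrieves content from Microsoft 365.

```python
# Hypothetical allow-list of SharePoint locations the AI may index.
ALLOWED_PREFIXES = (
    "https://contoso.sharepoint.com/sites/policies/",
    "https://contoso.sharepoint.com/sites/hr/Shared Documents/",
)


def is_indexable(url: str) -> bool:
    """True only if the document sits under an approved site or library."""
    return url.startswith(ALLOWED_PREFIXES)


def select_for_indexing(urls: list[str]) -> list[str]:
    """Filter a connector's document list down to approved locations."""
    return [url for url in urls if is_indexable(url)]


candidates = [
    "https://contoso.sharepoint.com/sites/policies/health-and-safety.docx",
    "https://contoso.sharepoint.com/sites/hr/Shared Documents/leave-policy.pdf",
    "https://contoso.sharepoint.com/sites/hr/Private/salary-review.xlsx",
    "https://contoso.sharepoint.com/sites/board/minutes-2024.docx",
]

for url in select_for_indexing(candidates):
    print(url)
```

Anything outside the approved prefixes never reaches the knowledge base, so the AI cannot retrieve it. The match here is case-sensitive and prefix-based; a real deployment would more likely rely on the connector's own permission scoping as well.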

What happens when private AI goes down?

Downtime handling depends on the deployment architecture. Properly designed private AI systems include redundancy, health monitoring, and automated restart capabilities. Sovata's Managed Support includes 24/7 monitoring and defined response times for outages. For critical applications, high-availability configurations with failover can be designed. The service level agreement specifies uptime commitments and response times.

How is private AI data backed up?

Private AI systems require backup strategies for both the AI infrastructure configuration and the knowledge base documents. Configuration backups ensure fast system restoration after hardware failures. Document backups ensure the AI's knowledge base is protected. These backups remain within your controlled infrastructure — they are not sent to external backup services unless you explicitly configure that.

What security requirements apply to private AI infrastructure?

Security requirements include: network isolation (AI systems should not be directly internet-accessible), access control (role-based authentication for users and administrators), encryption at rest and in transit, patch management (operating system, AI framework, and model updates), physical security (for on-premises deployments), and audit logging (all access and system events). These requirements are addressed in Sovata's deployment process.

Can we switch from private AI to cloud AI, or vice versa, later?

Yes. A key principle of Sovata's approach is that your data and AI configuration are yours — you are not locked into a proprietary format or infrastructure. Your documents, knowledge base configuration, and governance policies can be migrated. If you later choose to move to a different infrastructure model, Sovata can support that transition.

What is retrieval-augmented generation (RAG) and why does it matter for private AI?

Retrieval-augmented generation (RAG) is the technical approach that enables private AI knowledge assistants. Instead of relying solely on what a model learned during training, RAG allows the AI to search a specific document collection, retrieve relevant passages, and use them to generate accurate, grounded answers. For private AI, this means answers about your organisation are drawn from your own documents, not from internet data or other organisations' content. The underlying model keeps its general language ability from training, but organisation-specific answers can be traced back to passages you have explicitly included in the knowledge base, which makes the system auditable and controllable.

Find the right private AI infrastructure for your organisation

Sovata designs, deploys, and operates private AI infrastructure for NZ organisations. We handle the technical complexity so you get a working, sovereign AI system — without needing your own infrastructure team.

Book a free Discovery Call

Free · 1 hour · New Zealand-based team

Get Started

Ready to become a Founding Partner?

A free Discovery Call takes one hour. We'll tell you honestly where AI can help, what it will take, and whether a Founding Partner arrangement is the right fit.

Free · No commitment · One hour · New Zealand-based