Introduction
Choosing the right VM size is one of the most important decisions you make when deploying a workload to the cloud. VM sizing is the process of matching CPU, RAM, and storage resources to the actual needs of your application so you get stable performance without paying for capacity you do not use. On Raff, this decision matters from day one because the platform offers both General Purpose and CPU-Optimized virtual machines, which means you can choose between lower-cost shared compute and dedicated compute depending on the workload.
Many teams size servers based on instinct, not evidence. That usually leads to one of two bad outcomes. The first is under-provisioning, where the application becomes slow, unstable, or unavailable under normal traffic. The second is over-provisioning, where the service runs fine but costs more than necessary every month. Good sizing is the discipline of avoiding both extremes.
This guide explains how to choose the right VM size by looking at the three core infrastructure resources first: compute, memory, and storage. Then it shows how different workload types consume those resources, how to recognize the signs of a bad fit, and how to choose between Raff General Purpose and CPU-Optimized plans. By the end, you will have a clear framework for selecting a starting tier, validating your decision after deployment, and resizing with confidence as your workload grows.
Understanding the Three Resources That Define a VM
Every virtual machine is built from the same foundation: CPU, RAM, and storage. Networking also matters, but for most first-pass sizing decisions, these three resources determine whether your application feels fast and stable or strained and unreliable.
CPU: How Much Work Your Server Can Process
CPU determines how much computation your VM can handle at the same time. In practical terms, CPU affects how many requests your application can process concurrently, how quickly workers can execute background jobs, and how much headroom you have during traffic spikes.
CPU pressure usually shows up in a few common ways. Pages load slowly when traffic increases. Build jobs take much longer than expected. Application workers fall behind on queues. Database queries that were acceptable at low traffic become noticeably slower. If your workload regularly performs compute-heavy work such as image processing, code compilation, video encoding, search indexing, or high-concurrency API handling, CPU is often the first resource to size carefully.
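The quickest first check for CPU pressure is comparing the load average against the vCPU count. A minimal sketch for a Linux VM (the threshold is an illustrative heuristic, not a Raff recommendation):

```shell
# Compare the 5-minute load average with the vCPU count; sustained load
# above the core count is a common sign of CPU pressure on Linux.
cores=$(nproc)
load5=$(awk '{print $2}' /proc/loadavg)
echo "vCPUs: $cores  5-min load: $load5"
# Shell arithmetic is integer-only, so scale the load by 100 first.
load5x100=$(awk '{printf "%d", $2 * 100}' /proc/loadavg)
if [ "$load5x100" -gt "$((cores * 100))" ]; then
  echo "Possible CPU pressure: load exceeds core count"
else
  echo "CPU headroom looks adequate"
fi
```

Checking load at peak hours rather than off-peak gives a truer picture of whether the tier fits.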
This is also where the difference between shared and dedicated compute matters. Shared CPU plans are often a good fit for small sites, staging servers, internal tools, and development environments. Dedicated CPU plans are a better fit for production applications that need consistent performance, especially when response time matters under sustained load.
RAM: The Resource That Prevents Slowdowns
RAM is the fast working memory your operating system and applications use while they run. Web servers use RAM for caching, application runtimes use it for active processes and objects, and databases depend heavily on memory for indexes, buffers, and query performance.
When a VM does not have enough RAM, performance degrades quickly. Linux will try to reclaim memory from caches, then begin swapping to disk if needed. Once that happens, even a fast NVMe disk cannot fully compensate for the latency penalty. Applications may restart, the kernel may kill processes, or the server may become intermittently unresponsive.
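You can spot this condition directly from `/proc/meminfo`. A small sketch that reports available RAM and swap in use:

```shell
# Report RAM availability and swap usage from /proc/meminfo.
# Sustained swap use on a VM is a strong hint that RAM is undersized.
mem_total=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
mem_avail=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)
swap_total=$(awk '/^SwapTotal:/ {print $2}' /proc/meminfo)
swap_free=$(awk '/^SwapFree:/ {print $2}' /proc/meminfo)
swap_used=$((swap_total - swap_free))
echo "RAM available: $((mem_avail / 1024)) MiB of $((mem_total / 1024)) MiB"
echo "Swap in use:   $((swap_used / 1024)) MiB"
# Kernel out-of-memory kills, if any, appear in the kernel log:
#   journalctl -k | grep -i 'out of memory'
```

Nonzero swap after a fresh boot is worth investigating before it turns into process kills.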
Memory-heavy workloads include relational databases, in-memory data stores, large application runtimes, analytics services, and container hosts running multiple services. For these workloads, extra RAM often produces a larger real-world improvement than adding more CPU.
Storage: Capacity and I/O Performance
Storage decisions are about both size and speed. Capacity answers how much data your workload needs to store. Speed answers how fast the workload can read and write that data. This distinction matters because small static websites barely touch disk after startup, while databases, log-heavy applications, build systems, and media-serving platforms depend on disk I/O constantly.
Raff uses NVMe SSD storage across its VM tiers, which is especially useful for workloads that perform frequent random reads and writes. Faster storage improves database responsiveness, reduces queue latency for disk-backed jobs, shortens boot and deployment times, and helps applications recover faster after restarts.
When planning storage, think beyond the operating system and the application itself. Include database growth, uploaded files, package caches, container images, logs, backups stored locally before offloading, and temporary build artifacts. Teams often size storage too tightly and then discover that the real problem is not CPU or memory but simply running out of disk.
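A periodic capacity check makes this failure mode visible early. A minimal sketch (the 80% threshold is an illustrative cutoff):

```shell
# Check root filesystem usage and flag it once it crosses 80%.
usage=$(df -P / | awk 'NR==2 {gsub(/%/, ""); print $5}')
echo "Root filesystem usage: ${usage}%"
if [ "$usage" -ge 80 ]; then
  echo "Warning: clean up logs, caches, and old images, or plan a resize"
fi
# To find what is consuming the disk, sort directories by size:
#   du -xh --max-depth=2 / 2>/dev/null | sort -rh | head -15
```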
Match the VM to the Workload Pattern
The best way to size a VM is to think in workload patterns rather than brand names or frameworks. A WordPress site, a Django app, and a Node.js API can all have very different infrastructure needs depending on traffic, plugins, caching, and database behavior. What matters is how the workload uses resources.
Small Websites and Landing Pages
Simple websites, brochure sites, and low-traffic CMS deployments usually need predictable uptime more than raw compute. These workloads often spend most of their time waiting for requests rather than consuming CPU continuously. They are usually a strong fit for a smaller plan, especially if caching is enabled and the database is modest.
For this type of workload, start small and prioritize clean architecture over large infrastructure. A lightweight server that uses Nginx effectively, serves static assets efficiently, and keeps plugins under control often performs better than a larger server with a bloated application stack.
Application Servers and APIs
Modern web applications usually need a more balanced profile. They may have a web server, an application runtime, background workers, and a database connection pool all competing for resources. These stacks often benefit from at least moderate RAM and enough CPU to absorb concurrency.
If the application receives traffic throughout the day, performs authentication, sends emails, generates reports, or calls external APIs, a dedicated CPU plan is often easier to operate in production because it reduces variability. This is especially true for SaaS products, customer portals, and internal business systems that need stable user-facing performance.
Databases
Databases deserve separate treatment because they are often memory-sensitive first and storage-sensitive second. CPU matters too, but many database slowdowns are caused by insufficient RAM for indexes and cache buffers, or by slow disk performance under read and write pressure.
For MySQL or PostgreSQL, you should usually avoid treating the database as an afterthought. If the database shares a VM with the application, budget extra memory from the start. If the workload is transactional, read-heavy, or expected to grow steadily, choose a CPU-Optimized plan sooner rather than later. Consistency matters more for databases than for most front-end workloads.
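One way to budget that extra memory up front is to reserve a fixed share of RAM for the database when it shares the VM. A rough sketch (the 50% split is an illustrative starting heuristic, not an official MySQL or PostgreSQL recommendation):

```shell
# Reserve roughly half of total RAM for the database when it shares
# the VM with the application (an illustrative starting heuristic).
mem_total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
db_budget_mb=$((mem_total_kb / 1024 / 2))
echo "Total RAM:           $((mem_total_kb / 1024)) MiB"
echo "Suggested DB budget: ${db_budget_mb} MiB"
# The budget then maps onto engine settings, for example:
#   MySQL:      innodb_buffer_pool_size = <budget>M
#   PostgreSQL: shared_buffers is commonly set to ~25% of total RAM
```

Whatever split you choose, write it down; a documented memory budget is much easier to revisit than one that only exists in the engine defaults.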
CI/CD, Build Runners, and Compute Jobs
Build systems, test runners, video processing, import pipelines, and background workers are typically CPU-bound. These workloads benefit directly from dedicated compute because the value of the service comes from completing work fast and predictably. A shared CPU plan may still be acceptable for low-priority jobs, but a production CI pipeline or frequent automation runner usually justifies CPU-Optimized resources.
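Because these jobs parallelize, extra dedicated cores pay off directly, provided the tooling actually uses them. A small sketch that derives parallelism from the VM rather than hard-coding it:

```shell
# Derive build parallelism from the VM's vCPU count so the same
# script uses all available cores on any tier without editing.
jobs=$(nproc)
echo "Parallel job slots: $jobs"
# Typical usage in build tooling:
#   make -j"$jobs"
#   cargo build --jobs "$jobs"
```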
Development and Staging Environments
Development and staging servers are ideal places to save money with right-sizing. They need enough capacity to mirror production behavior meaningfully, but not necessarily the same scale. General Purpose plans are a natural fit here because they keep costs low while still providing enough RAM and NVMe storage for realistic testing.
A Practical Sizing Matrix
The table below gives a starting point for common workload categories. It is not a universal truth, but it is a reliable planning baseline.
| Workload | Recommended starting profile | Why it fits | Raff direction |
|---|---|---|---|
| Personal site, docs site, low-traffic blog | 1-2 vCPU, 1-2 GB RAM, 25-50 GB SSD | Light concurrency, small database, low background processing | CPU-Optimized Tier 1-2 or smallest practical shared environment |
| WordPress or small business site | 1-2 vCPU, 2-4 GB RAM, 50-80 GB SSD | PHP workers and database need more RAM than static sites | CPU-Optimized Tier 2-3 |
| Small web app or API | 2 vCPU, 4 GB RAM, 80 GB SSD | Balanced compute and memory for app runtime plus database activity | CPU-Optimized Tier 3 |
| Internal tools or staging server | 2 shared vCPU, 4 GB RAM, 50 GB SSD | Lower cost, acceptable performance variability | General Purpose 2 vCPU / 4 GB |
| Busy CMS, ecommerce starter, moderate SaaS app | 4 vCPU, 8 GB RAM, 120 GB SSD | More concurrent requests, larger DB, heavier caches | CPU-Optimized Tier 4 |
| Dedicated database server | 4 vCPU, 8 GB RAM or higher, 120+ GB SSD | Memory and storage performance are critical | CPU-Optimized Tier 4+ |
| Container host with several services | 4-8 vCPU, 8-16 GB RAM, 120-180 GB SSD | Multiple processes compete for memory and compute | CPU-Optimized Tier 4-5 |
| CI/CD runner, encoding, automation workers | 4-8 vCPU, 8-16 GB RAM | CPU-heavy, often parallel workloads | CPU-Optimized Tier 4-5 |
The goal of this matrix is not to push every workload upward. It is to help you choose a safe baseline. A right-sized server is not the largest server you can afford. It is the smallest server that supports your workload with healthy headroom.
Choose Between General Purpose and CPU-Optimized
On Raff, the sizing decision is not just how large the VM should be, but what kind of compute profile it should have.
General Purpose VMs
General Purpose VMs use shared CPU resources. That makes them attractive for cost-sensitive workloads where occasional performance variation is acceptable. They are well suited to development environments, small websites, staging servers, admin tools, and lightweight application stacks.
Use a General Purpose plan when your workload is:
- Not consistently latency-sensitive
- Light to moderate in average CPU use
- Simple to resize later if needs change
- A better fit for cost efficiency than for absolute consistency
CPU-Optimized VMs
CPU-Optimized VMs provide dedicated vCPUs. That makes them a stronger choice for production systems where compute consistency affects user experience or operational throughput.
Use a CPU-Optimized plan when your workload is:
- Customer-facing and performance-sensitive
- Running a database or queue workers
- Handling sustained traffic or parallel jobs
- Expected to scale steadily
- Serving as CI/CD infrastructure or automation runners
A useful rule is simple: choose General Purpose when cost efficiency matters most and the workload is forgiving, then move to CPU-Optimized when the workload becomes production-critical or compute-sensitive.
How to Tell That Your VM Is the Wrong Size
The best sizing decisions are validated after deployment. Even a careful estimate is still an estimate. Once the application is running, look for evidence.
Signs the VM Is Too Small
A VM is probably undersized if you see:
- Sustained CPU usage near saturation during ordinary traffic
- Frequent memory pressure or swap usage
- Slow response times during modest load
- Background jobs piling up
- Database latency increasing sharply at peak hours
- Disk space dropping too close to zero
- Reboots or process kills caused by memory exhaustion
Undersizing is expensive in a hidden way. The monthly bill is lower, but the operational cost shows up as lost performance, firefighting, customer frustration, and delayed releases.
Signs the VM Is Too Large
A VM may be oversized if:
- CPU usage stays very low most of the time
- RAM is mostly unused even at peak
- Growth has slowed but the infrastructure footprint never changed
- You added headroom for a launch or migration that already passed
- The application stack is simple but the VM profile stayed enterprise-sized
Oversizing is less dramatic than undersizing, but it erodes efficiency month after month. Right-sizing is cost control without service degradation.
A Safe Process for Right-Sizing Over Time
Cloud providers broadly recommend choosing instance sizes based on observed workload requirements and revisiting those decisions as usage data accumulates. AWS describes right sizing as matching instance types and sizes to workload requirements at the lowest cost, while Google Cloud and Azure both provide machine-size guidance based on actual resource needs and observed utilization.
A practical process looks like this:
- Start with the smallest tier that gives you enough headroom for launch.
- Monitor CPU, RAM, disk usage, and disk growth under real traffic.
- Review behavior during peak periods, not only averages.
- Resize when the evidence is clear, either upward for stability or downward for savings.
- Re-check after meaningful application changes such as a new feature, plugin set, background job, or traffic campaign.
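The monitoring step can be as simple as a periodic sampler run from cron, so that peak behavior is recorded rather than guessed. A minimal sketch (the log path is a placeholder):

```shell
# Append one utilization sample per run; scheduling this from cron
# (e.g. every 5 minutes) builds the peak-hour evidence resizing needs.
logfile="${SIZING_LOG:-/tmp/sizing.log}"   # placeholder path
ts=$(date -u +%Y-%m-%dT%H:%M:%SZ)
load=$(cut -d' ' -f1-3 /proc/loadavg)
mem_avail=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)
disk=$(df -P / | awk 'NR==2 {print $5}')
echo "$ts load=$load mem_avail_kb=$mem_avail disk=$disk" >> "$logfile"
tail -n 1 "$logfile"
```

A few weeks of samples like this is usually enough evidence to resize in either direction with confidence.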
This process works especially well on Raff because you can resize instead of rebuilding from scratch. That changes sizing from a one-time gamble into an iterative operational decision.
Raff-Specific Sizing Recommendations
Raff’s product structure makes sizing easier because the tiers are clear and the trade-offs are easy to understand.
When to start with General Purpose
Start with General Purpose if you are launching:
- A staging or QA server
- A development environment
- A small website with moderate traffic
- A low-priority internal dashboard
- A proof of concept or MVP
The entry General Purpose tier begins at 2 shared vCPUs, 4 GB RAM, and 50 GB NVMe SSD for $4.99/month, which is a strong fit for cost-conscious deployments that still need enough memory for modern Linux stacks.
When to start with CPU-Optimized
Start with CPU-Optimized if you are launching:
- A production web application
- A transactional database
- A queue-heavy backend
- A CI/CD runner
- A workload where predictable latency matters
The entry CPU-Optimized tier begins at 1 vCPU, 1 GB RAM, and 25 GB NVMe SSD for $3.99/month, while Tier 3 provides 2 vCPU, 4 GB RAM, and 80 GB NVMe SSD for $19.99/month. That Tier 3 profile is often the most practical baseline for small production applications because it balances compute, memory, and storage without excessive cost.
Sample Starting Points on Raff
Here is a simple shortlist:
- Simple landing page or small static site: CPU-Optimized Tier 1 or Tier 2
- Small CMS or WordPress deployment: CPU-Optimized Tier 2
- Production web app or API: CPU-Optimized Tier 3
- Moderate ecommerce or larger CMS: CPU-Optimized Tier 4
- Database-focused workload: CPU-Optimized Tier 4 or higher
- Dev or staging environment: General Purpose 2 vCPU / 4 GB or 4 vCPU / 8 GB depending on team size
These are starting points, not permanent commitments. The best long-term tier is the one your monitoring confirms.
Best Practices for Making the First Sizing Decision
To make a good first decision, keep these principles in mind:
- Size for the application you have, not the one you imagine. It is better to resize later than to pay for speculative growth too early.
- Treat databases separately. Databases often need more memory and more predictable performance than the web layer.
- Leave room for the operating system and background processes. Do not size only for the main application process.
- Plan storage growth explicitly. Logs, uploads, backups, and containers consume disk faster than teams expect.
- Use smaller environments to learn. Development and staging can reveal where your real bottleneck is before production traffic exposes it.
- Prefer consistency for production. Dedicated CPU is often worth it once the workload affects customers or revenue.
Conclusion
Choosing the right VM size is about fit, not excess. A well-sized VM has enough CPU for concurrency, enough RAM to avoid pressure, and enough NVMe storage for both current data and short-term growth. The right choice also depends on workload shape: development and staging environments usually reward lower-cost shared compute, while databases, production applications, and CI/CD systems often benefit from dedicated CPU.
If you are unsure where to begin, choose a conservative starting tier, deploy, and measure. That is the most reliable path to a stable and cost-efficient environment. Raff makes that approach practical with clear VM categories, instant resize, and pricing that scales from lightweight environments to serious production workloads.
From here, the next useful step is to pair sizing with architecture. Once you know how large the VM should be, the next questions become how to back it up, how to secure it, and when to separate the database from the application layer. Those are the decisions that turn a functioning server into a resilient production system.