
Testing and Debugging MCP Servers: A Complete Guide

· 10 min read
ToolBoost Engineering Team

Building an MCP server is one thing. Making sure it works reliably is another. This comprehensive guide covers testing strategies, debugging techniques, and tools for building production-ready MCP servers.

Why Testing Matters

Without Tests:

  • "It works on my machine" ๐Ÿคท
  • Breaking changes go unnoticed
  • Regressions in production
  • Hard to refactor safely
  • No confidence in deployments

With Tests:

  • Catch bugs before deployment ✅
  • Safe refactoring
  • Living documentation
  • Faster development (catch issues early)
  • Production confidence

Testing Strategy Overview

┌──────────────────────────────┐
│ Unit Tests (Fast)            │ Test individual functions
├──────────────────────────────┤
│ Integration Tests (Medium)   │ Test MCP protocol integration
├──────────────────────────────┤
│ End-to-End Tests (Slow)      │ Test with real AI clients
└──────────────────────────────┘

Testing Pyramid:

  • 70% Unit Tests
  • 20% Integration Tests
  • 10% E2E Tests

Unit Testing

Test your business logic in isolation.

Setup

npm install --save-dev jest @types/jest ts-jest

jest.config.js:

module.exports = {
  preset: 'ts-jest',
  testEnvironment: 'node',
  testMatch: ['**/__tests__/**/*.test.ts'],
  collectCoverageFrom: [
    'src/**/*.ts',
    '!src/**/*.d.ts'
  ]
};

Example: Testing Tool Logic

src/tools/tasks.ts:

export interface Task {
  id: string;
  title: string;
  status: string;
  priority: string;
  createdAt: string;
  updatedAt: string;
}

export function validateTask(task: any): string[] {
  const errors: string[] = [];

  if (!task.title || task.title.trim() === '') {
    errors.push('Title is required');
  }

  if (task.title && task.title.length > 200) {
    errors.push('Title must be <= 200 characters');
  }

  if (task.priority && !['low', 'medium', 'high'].includes(task.priority)) {
    errors.push('Priority must be low, medium, or high');
  }

  return errors;
}

export function formatTaskResponse(task: Task): string {
  return `Task: ${task.title}\n` +
    `Status: ${task.status}\n` +
    `Priority: ${task.priority}\n` +
    `Created: ${task.createdAt}`;
}

src/tools/__tests__/tasks.test.ts:

import { validateTask, formatTaskResponse } from '../tasks';

describe('validateTask', () => {
  it('should pass for valid task', () => {
    const task = {
      title: 'Valid task',
      priority: 'medium'
    };

    const errors = validateTask(task);
    expect(errors).toHaveLength(0);
  });

  it('should fail when title is missing', () => {
    const task = { priority: 'medium' };

    const errors = validateTask(task);
    expect(errors).toContain('Title is required');
  });

  it('should fail when title is too long', () => {
    const task = {
      title: 'a'.repeat(201),
      priority: 'medium'
    };

    const errors = validateTask(task);
    expect(errors).toContain('Title must be <= 200 characters');
  });

  it('should fail for invalid priority', () => {
    const task = {
      title: 'Task',
      priority: 'critical' // Invalid
    };

    const errors = validateTask(task);
    expect(errors).toContain('Priority must be low, medium, or high');
  });
});

describe('formatTaskResponse', () => {
  it('should format task correctly', () => {
    const task = {
      id: '123',
      title: 'Test task',
      status: 'todo',
      priority: 'high',
      createdAt: '2025-05-08T10:00:00Z',
      updatedAt: '2025-05-08T10:00:00Z'
    };

    const formatted = formatTaskResponse(task);

    expect(formatted).toContain('Task: Test task');
    expect(formatted).toContain('Status: todo');
    expect(formatted).toContain('Priority: high');
  });
});

Run tests:

npm test

Testing Database Operations

Use test database or mocking:

Option 1: Test Database

import { Pool } from 'pg';
import { randomUUID } from 'crypto';

let testPool: Pool;

beforeAll(async () => {
  testPool = new Pool({
    connectionString: process.env.TEST_DATABASE_URL
  });

  // Create tables
  await testPool.query(`
    CREATE TABLE IF NOT EXISTS tasks (
      id UUID PRIMARY KEY,
      title TEXT NOT NULL,
      status TEXT NOT NULL
    )
  `);
});

afterAll(async () => {
  // Drop tables
  await testPool.query('DROP TABLE tasks');
  await testPool.end();
});

beforeEach(async () => {
  // Clear data between tests
  await testPool.query('DELETE FROM tasks');
});

test('should create task in database', async () => {
  const task = {
    id: randomUUID(), // the column is UUID, so a plain '123' would be rejected
    title: 'Test',
    status: 'todo'
  };

  await testPool.query(
    'INSERT INTO tasks (id, title, status) VALUES ($1, $2, $3)',
    [task.id, task.title, task.status]
  );

  const result = await testPool.query('SELECT * FROM tasks WHERE id = $1', [task.id]);

  expect(result.rows[0].title).toBe('Test');
});

Option 2: Mock Database

import { Pool } from 'pg';

// Mock the pg module
jest.mock('pg', () => {
  const mockQuery = jest.fn();
  const mockPool = jest.fn(() => ({
    query: mockQuery,
    connect: jest.fn(),
    end: jest.fn()
  }));

  return { Pool: mockPool };
});

test('should call database with correct query', async () => {
  const pool = new Pool();
  const mockQuery = pool.query as jest.Mock;

  mockQuery.mockResolvedValue({ rows: [{ id: '123', title: 'Test' }] });

  const result = await pool.query('SELECT * FROM tasks WHERE id = $1', ['123']);

  expect(mockQuery).toHaveBeenCalledWith(
    'SELECT * FROM tasks WHERE id = $1',
    ['123']
  );
  expect(result.rows[0].title).toBe('Test');
});

Integration Testing

Test your MCP server's protocol implementation.

Using MCP SDK Test Utilities

import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import {
  CallToolRequestSchema,
  ListToolsRequestSchema
} from '@modelcontextprotocol/sdk/types.js';

describe('MCP Server Integration', () => {
  let server: Server;

  beforeEach(() => {
    server = new Server(
      { name: 'test-server', version: '1.0.0' },
      { capabilities: { tools: {} } }
    );

    // Register handlers
    server.setRequestHandler(ListToolsRequestSchema, async () => ({
      tools: [
        {
          name: 'create_task',
          description: 'Create a task',
          inputSchema: {
            type: 'object',
            properties: {
              title: { type: 'string' }
            },
            required: ['title']
          }
        }
      ]
    }));

    server.setRequestHandler(CallToolRequestSchema, async (request) => {
      if (request.params.name === 'create_task') {
        return {
          content: [{
            type: 'text',
            text: `Created: ${request.params.arguments.title}`
          }]
        };
      }
      throw new Error(`Unknown tool: ${request.params.name}`);
    });
  });

  // Note: handleRequest is used below for brevity; depending on your SDK
  // version you may need to drive the server through a connected transport
  // (e.g. InMemoryTransport) to exercise handlers instead.
  test('should list available tools', async () => {
    const request = {
      method: 'tools/list',
      params: {}
    };

    const response = await server.handleRequest(request);

    expect(response.tools).toHaveLength(1);
    expect(response.tools[0].name).toBe('create_task');
  });

  test('should call tool successfully', async () => {
    const request = {
      method: 'tools/call',
      params: {
        name: 'create_task',
        arguments: {
          title: 'Test Task'
        }
      }
    };

    const response = await server.handleRequest(request);

    expect(response.content[0].text).toContain('Created: Test Task');
  });

  test('should handle unknown tool error', async () => {
    const request = {
      method: 'tools/call',
      params: {
        name: 'unknown_tool',
        arguments: {}
      }
    };

    await expect(server.handleRequest(request)).rejects.toThrow('Unknown tool');
  });

  test('should validate tool input schema', async () => {
    const request = {
      method: 'tools/call',
      params: {
        name: 'create_task',
        arguments: {
          // Missing required 'title'
        }
      }
    };

    // Assumes your handler validates arguments against the input schema;
    // the SDK does not enforce this automatically.
    await expect(server.handleRequest(request)).rejects.toThrow();
  });
});

Testing Resources

import { ListResourcesRequestSchema, ReadResourceRequestSchema } from '@modelcontextprotocol/sdk/types.js';

describe('MCP Resources', () => {
  let server: Server;

  beforeEach(() => {
    server = new Server(
      { name: 'test-server', version: '1.0.0' },
      { capabilities: { resources: {} } }
    );

    server.setRequestHandler(ListResourcesRequestSchema, async () => ({
      resources: [
        {
          uri: 'task:///123',
          name: 'Test Task',
          mimeType: 'application/json'
        }
      ]
    }));

    server.setRequestHandler(ReadResourceRequestSchema, async (request) => {
      return {
        contents: [{
          uri: request.params.uri,
          mimeType: 'application/json',
          text: JSON.stringify({ id: '123', title: 'Test Task' })
        }]
      };
    });
  });

  test('should list resources', async () => {
    const response = await server.handleRequest({
      method: 'resources/list',
      params: {}
    });

    expect(response.resources).toHaveLength(1);
    expect(response.resources[0].uri).toBe('task:///123');
  });

  test('should read resource', async () => {
    const response = await server.handleRequest({
      method: 'resources/read',
      params: { uri: 'task:///123' }
    });

    const data = JSON.parse(response.contents[0].text);
    expect(data.id).toBe('123');
    expect(data.title).toBe('Test Task');
  });
});

End-to-End Testing

Test with real AI clients.

Using MCP Inspector

1. Start your server:

npm run build
node dist/index.js

2. Run MCP Inspector:

npx @modelcontextprotocol/inspector node dist/index.js

3. Test manually in the browser: the Inspector opens a web UI where you can list tools and resources, call them with sample arguments, and inspect the raw protocol messages.

Automated E2E Tests

import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { StdioClientTransport } from '@modelcontextprotocol/sdk/client/stdio.js';

describe('E2E Tests', () => {
  let client: Client;
  let transport: StdioClientTransport;

  beforeAll(async () => {
    // StdioClientTransport spawns and manages the server process itself,
    // so there is no need to spawn dist/index.js separately.
    transport = new StdioClientTransport({
      command: 'node',
      args: ['dist/index.js']
    });

    client = new Client(
      { name: 'test-client', version: '1.0.0' },
      { capabilities: {} }
    );

    await client.connect(transport);
  });

  afterAll(async () => {
    await client.close();
  });

  test('should create task end-to-end', async () => {
    const result = await client.callTool({
      name: 'create_task',
      arguments: {
        title: 'E2E Test Task',
        description: 'Created in E2E test'
      }
    });

    expect(result.content[0].text).toContain('E2E Test Task');
  });
});

Debugging Techniques

1. Enable Debug Logging

import debug from 'debug';

const log = debug('mcp:server');
const logTool = debug('mcp:tool');
const logError = debug('mcp:error');

server.setRequestHandler(CallToolRequestSchema, async (request) => {
  logTool('Tool called:', request.params.name);
  logTool('Arguments:', request.params.arguments);

  try {
    const result = await handleTool(request);
    logTool('Result:', result);
    return result;
  } catch (error) {
    logError('Error:', error);
    throw error;
  }
});

Run with debug output:

DEBUG=mcp:* node dist/index.js

2. Request/Response Logging

import fs from 'fs/promises';

const logFile = 'mcp-requests.log';

server.setRequestHandler(CallToolRequestSchema, async (request) => {
  // Log the request
  await fs.appendFile(logFile, JSON.stringify({
    timestamp: new Date().toISOString(),
    type: 'request',
    method: request.method,
    params: request.params
  }) + '\n');

  try {
    const response = await handleTool(request);

    // Log the response
    await fs.appendFile(logFile, JSON.stringify({
      timestamp: new Date().toISOString(),
      type: 'response',
      method: request.method,
      success: true,
      response
    }) + '\n');

    return response;
  } catch (error) {
    // Log the error
    await fs.appendFile(logFile, JSON.stringify({
      timestamp: new Date().toISOString(),
      type: 'error',
      method: request.method,
      error: error.message,
      stack: error.stack
    }) + '\n');

    throw error;
  }
});

3. Interactive Debugging

VS Code launch.json:

{
  "version": "0.2.0",
  "configurations": [
    {
      "type": "node",
      "request": "launch",
      "name": "Debug MCP Server",
      "skipFiles": ["<node_internals>/**"],
      "program": "${workspaceFolder}/src/index.ts",
      "preLaunchTask": "tsc: build - tsconfig.json",
      "outFiles": ["${workspaceFolder}/dist/**/*.js"],
      "env": {
        "DATABASE_URL": "postgresql://localhost:5432/test"
      }
    }
  ]
}

Set breakpoints and debug step-by-step.
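One caveat when debugging stdio-based MCP servers: stdout carries the JSON-RPC stream, so a stray console.log can corrupt the protocol and make the server appear to hang. A minimal sketch of stderr-only diagnostics (the MCP_DEBUG flag and logDebug name are illustrative, not SDK APIs):

```typescript
// Sketch (assumes a stdio transport): stdout is reserved for JSON-RPC,
// so all diagnostics must go to stderr.
function logDebug(...args: unknown[]): void {
  if (process.env.MCP_DEBUG) {
    console.error('[mcp-debug]', ...args); // stderr, never stdout
  }
}

// logDebug('server starting'); // emitted only when MCP_DEBUG is set
```

The `debug` package shown earlier follows the same rule: it writes to stderr by default.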

4. Error Tracking

import * as Sentry from '@sentry/node';

Sentry.init({
  dsn: process.env.SENTRY_DSN,
  environment: process.env.NODE_ENV || 'development',
});

server.setRequestHandler(CallToolRequestSchema, async (request) => {
  try {
    return await handleTool(request);
  } catch (error) {
    // Capture the error in Sentry with tool context
    Sentry.captureException(error, {
      extra: {
        tool: request.params.name,
        arguments: request.params.arguments
      }
    });

    throw error;
  }
});

Common Issues and Solutions

Issue 1: "Server not responding"

Debugging:

# Check if the server process is running
ps aux | grep node

# Capture stdout/stderr
node dist/index.js 2>&1 | tee server.log

# Send a minimal JSON-RPC request by hand
echo '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' | node dist/index.js

Common causes:

  • Server crashed on startup
  • Syntax error in code
  • Missing environment variables
  • Port already in use

Issue 2: "Tool call timeout"

Debugging:

// Add timeout monitoring
const startTime = Date.now();

const result = await handleTool(request);

const duration = Date.now() - startTime;
if (duration > 5000) {
  console.warn(`Slow tool call: ${request.params.name} took ${duration}ms`);
}

Common causes:

  • Slow database queries
  • External API timeouts
  • Large data processing
  • Missing indexes
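Beyond monitoring, it can help to convert hangs into explicit errors so they surface in your logs. A sketch of a generic deadline wrapper using Promise.race (withTimeout is a hypothetical helper, not part of the SDK):

```typescript
// Hypothetical helper: reject if a promise (e.g. a tool handler) exceeds
// a deadline, turning silent hangs into explicit, loggable timeout errors.
function withTimeout<T>(promise: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms}ms`)), ms);
  });
  // Whichever settles first wins; always clear the timer to avoid leaks
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Usage inside a handler (sketch):
// const result = await withTimeout(handleTool(request), 10000, request.params.name);
```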

Issue 3: "Invalid response format"

Debugging:

// Validate the response before returning it (inside a request handler)
import Ajv from 'ajv';

const ajv = new Ajv();
const validateResponse = ajv.compile({
  type: 'object',
  required: ['content'],
  properties: {
    content: {
      type: 'array',
      items: {
        type: 'object',
        required: ['type', 'text'],
        properties: {
          type: { type: 'string', enum: ['text', 'image', 'resource'] },
          text: { type: 'string' }
        }
      }
    }
  }
});

const response = buildResponse();

if (!validateResponse(response)) {
  console.error('Invalid response:', validateResponse.errors);
  throw new Error('Invalid response format');
}

return response;

Testing Best Practices

1. Test Isolation

Each test should be independent:

// โŒ Bad: Tests share state
let userId;

test('create user', () => {
userId = createUser();
});

test('get user', () => {
getUser(userId); // Depends on previous test
});

// โœ… Good: Each test is independent
test('create user', () => {
const userId = createUser();
expect(userId).toBeDefined();
});

test('get user', () => {
const userId = createUser(); // Create own test data
const user = getUser(userId);
expect(user).toBeDefined();
});

2. Use Test Fixtures

// tests/fixtures/tasks.ts
export const validTask = {
  title: 'Test Task',
  description: 'Test description',
  priority: 'medium'
};

export const invalidTask = {
  title: '', // Invalid: empty title
  priority: 'invalid'
};

// In tests
import { validTask, invalidTask } from './fixtures/tasks';

test('should accept valid task', () => {
  const errors = validateTask(validTask);
  expect(errors).toHaveLength(0);
});

3. Test Error Paths

test('should handle database connection error', async () => {
  // Mock a database failure
  mockQuery.mockRejectedValue(new Error('Connection failed'));

  await expect(createTask({ title: 'Test' }))
    .rejects
    .toThrow('Connection failed');
});

test('should handle malformed input', async () => {
  const invalidInput = { title: null };

  await expect(createTask(invalidInput))
    .rejects
    .toThrow('Title is required');
});

4. Coverage Targets

# Generate coverage report
npm test -- --coverage

Targets:

  • Overall: >80%
  • Critical paths: 100%
  • Error handling: 100%
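Jest can enforce these targets automatically via coverageThreshold in jest.config.js, failing the run when coverage drops below the bar (the per-directory path below is an example, not a requirement):

```javascript
module.exports = {
  preset: 'ts-jest',
  testEnvironment: 'node',
  coverageThreshold: {
    // Fail `npm test -- --coverage` if overall coverage drops below 80%
    global: {
      branches: 80,
      functions: 80,
      lines: 80,
      statements: 80
    },
    // Stricter bar for critical paths (directory path is illustrative)
    './src/tools/': {
      lines: 100
    }
  }
};
```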

Continuous Integration

GitHub Actions

.github/workflows/test.yml:

name: Tests

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest

    services:
      postgres:
        image: postgres:14
        env:
          POSTGRES_PASSWORD: postgres
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
        ports:
          - 5432:5432  # publish the port so tests can reach Postgres on localhost

    steps:
      - uses: actions/checkout@v3

      - name: Setup Node.js
        uses: actions/setup-node@v3
        with:
          node-version: '18'

      - name: Install dependencies
        run: npm ci

      - name: Run tests
        run: npm test
        env:
          DATABASE_URL: postgresql://postgres:postgres@localhost:5432/test

      - name: Upload coverage
        uses: codecov/codecov-action@v3
        with:
          files: ./coverage/lcov.info

Production Monitoring

Health Checks

// Add a health check. Note: MCP has no standard 'health' method, so expose
// the check as a plain function (or an HTTP endpoint if your server also
// runs an HTTP transport).
async function healthCheck() {
  // Check the database connection
  try {
    await pool.query('SELECT 1');
  } catch (error) {
    return { status: 'unhealthy', database: 'disconnected' };
  }

  // Check other dependencies here as needed
  return {
    status: 'healthy',
    database: 'connected',
    uptime: process.uptime()
  };
}

Metrics Collection

import { register, Counter, Histogram } from 'prom-client';

const requestCounter = new Counter({
  name: 'mcp_requests_total',
  help: 'Total number of requests',
  labelNames: ['tool', 'status']
});

const requestDuration = new Histogram({
  name: 'mcp_request_duration_seconds',
  help: 'Request duration in seconds',
  labelNames: ['tool']
});

server.setRequestHandler(CallToolRequestSchema, async (request) => {
  const end = requestDuration.startTimer({ tool: request.params.name });

  try {
    const result = await handleTool(request);
    requestCounter.inc({ tool: request.params.name, status: 'success' });
    return result;
  } catch (error) {
    requestCounter.inc({ tool: request.params.name, status: 'error' });
    throw error;
  } finally {
    end();
  }
});

// Expose a metrics endpoint (assumes an Express `app` is also running)
app.get('/metrics', async (req, res) => {
  res.set('Content-Type', register.contentType);
  res.end(await register.metrics());
});

Conclusion

Testing and debugging are essential for production-ready MCP servers. Invest time in comprehensive tests and debugging tools to save headaches later.

Key Takeaways:

  • ✅ Write unit tests for business logic
  • ✅ Use integration tests for MCP protocol
  • ✅ Add E2E tests for critical paths
  • ✅ Enable debug logging
  • ✅ Monitor production servers
  • ✅ Automate testing in CI/CD

ToolBoost provides:

  • Built-in monitoring
  • Error tracking
  • Performance metrics
  • Automated health checks

Deploy with confidence! 🚀


Need help testing your MCP? Contact ToolBoost

Deploy tested MCPs with ToolBoost - monitoring included.