Security
Security is not a feature you add at the end. It's a property of every decision you make: how you store a secret, how you validate input, how you configure a container, how you handle a dependency update. Most security incidents don't come from sophisticated attacks; they come from known vulnerabilities that nobody patched, secrets that leaked into git, or input that nobody validated. The automation exists. Use it.
Why this matters
A single leaked API key or unpatched dependency can compromise a client's data, trigger a breach notification, and damage trust that took years to build. S&P's value of Integrity means we take security seriously not because a compliance checklist demands it, but because our clients trust us with their systems and their users' data.
Security is also where Care shows up at scale. Every shortcut (hardcoded credentials, skipped dependency updates, disabled HTTPS in dev) creates a risk that someone else will pay for. The practices in this section make secure defaults the path of least resistance so that doing the right thing is easier than doing the wrong thing.
The standard
Secure coding fundamentals
These apply regardless of language or framework. They map to OWASP Top 10 categories and are the minimum bar for any S&P project.
Input validation
Validate all input at the system boundary (API endpoints, form submissions, webhook receivers, queue consumers). Use schema validation (Zod, class-validator, JSON Schema) rather than manual checks. Reject invalid input early and explicitly. For NestJS DTO validation patterns (class-validator, nested objects, global ValidationPipe), see Backend Reference -- DTOs and input validation.
Never trust client input to determine authorization. A user sending { "role": "admin" } in a request body should have no effect. Authorization comes from the session/token, not the payload.
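The two rules above can be sketched without any dependencies. This is a minimal illustration, not a recommended implementation: the hand-written parser only stands in for a schema library such as Zod, and every name here (Session, UpdateProfileInput, parseUpdateProfile, handleUpdateProfile, the 80-character limit) is hypothetical.

```typescript
// Hypothetical shapes for illustration; not taken from any S&P codebase.
interface Session {
  userId: string;
  role: "user" | "admin";
}

interface UpdateProfileInput {
  displayName: string;
}

interface ParseResult {
  value?: UpdateProfileInput;
  error?: string;
}

interface HandlerResponse {
  status: number;
  error?: string;
  role?: string;
  profile?: UpdateProfileInput;
}

// Boundary parser: takes unknown input and returns only the fields the
// endpoint declares. Extra keys such as "role" are dropped, never trusted.
function parseUpdateProfile(body: unknown): ParseResult {
  if (typeof body !== "object" || body === null) {
    return { error: "body must be a JSON object" };
  }
  const displayName = (body as Record<string, unknown>).displayName;
  if (typeof displayName !== "string" || displayName.trim().length === 0) {
    return { error: "displayName must be a non-empty string" };
  }
  if (displayName.length > 80) {
    return { error: "displayName must be at most 80 characters" };
  }
  return { value: { displayName: displayName.trim() } };
}

// The role used for authorization comes from the session, not the payload.
function handleUpdateProfile(session: Session, body: unknown): HandlerResponse {
  const parsed = parseUpdateProfile(body);
  if (parsed.value === undefined) {
    return { status: 400, error: parsed.error ?? "invalid input" };
  }
  return { status: 200, role: session.role, profile: parsed.value };
}
```

Calling handleUpdateProfile with a body of { "displayName": "Ada", "role": "admin" } and a session whose role is "user" still yields role "user": the extra field never reaches the handler's logic.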
Output encoding
When rendering user-supplied content, encode output for the target context (HTML, URL, SQL, shell). Modern frameworks (React, Next.js) handle HTML encoding by default, but be aware of dangerouslySetInnerHTML, template literals in SQL queries, and string interpolation in shell commands.
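As a small illustration of context-specific encoding, the sketch below escapes a value for HTML text and percent-encodes the same value for a URL path segment. escapeHtml and profileLink are hypothetical helpers written for this example; in React or Next.js the framework already handles the HTML part, so this matters mostly when markup is assembled by hand.

```typescript
const HTML_ESCAPES: Record<string, string> = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// Encode for an HTML text or attribute context.
function escapeHtml(input: string): string {
  return input.replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch] ?? ch);
}

// The same value needs different encoding depending on where it lands:
// percent-encoding inside the URL path, HTML escaping inside the link text.
function profileLink(username: string): string {
  const href = `/users/${encodeURIComponent(username)}`;
  return `<a href="${href}">${escapeHtml(username)}</a>`;
}
```

SQL and shell contexts follow the same principle, but there the safer route is usually to avoid string building entirely (parameterized queries, argument arrays) rather than to escape.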
Authentication and authorization
- Use established auth providers (Firebase Auth, Auth0, Supabase Auth) rather than rolling your own. Custom auth implementations are a frequent source of critical security bugs.
- Store sessions server-side or use signed, httpOnly, secure cookies. Never store tokens in localStorage: it's accessible to any script on the page (XSS vector).
- Implement authorization checks at the service layer, not just the controller. A missing ownership check on a single endpoint is enough to create an IDOR (insecure direct object reference) vulnerability.
- Use RBAC or ABAC (e.g., CASL) for permission management. Avoid hardcoded role checks scattered through the codebase.
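A minimal sketch of a service-layer check, assuming a simple in-memory store. The types, the InvoiceService class, and the canReadInvoice policy are all hypothetical; in a larger codebase a permissions library such as CASL would take the place of the single policy function.

```typescript
// Hypothetical domain types for illustration.
interface Actor {
  userId: string;
  role: "user" | "admin";
}

interface Invoice {
  id: string;
  ownerId: string;
  totalCents: number;
}

interface ReadResult {
  status: "ok" | "not_found" | "forbidden";
  invoice?: Invoice;
}

// One policy function: the single place that decides who may read an invoice.
function canReadInvoice(actor: Actor, invoice: Invoice): boolean {
  return actor.role === "admin" || invoice.ownerId === actor.userId;
}

class InvoiceService {
  private readonly invoices: Map<string, Invoice>;

  constructor(invoices: Map<string, Invoice>) {
    this.invoices = invoices;
  }

  // The check lives in the service, so every caller (controller, queue
  // consumer, scheduled job) passes through it.
  getInvoice(actor: Actor, invoiceId: string): ReadResult {
    const invoice = this.invoices.get(invoiceId);
    if (!invoice) return { status: "not_found" };
    if (!canReadInvoice(actor, invoice)) return { status: "forbidden" };
    return { status: "ok", invoice };
  }
}
```

Because the actor is a required argument, a new endpoint cannot fetch an invoice by ID without also supplying who is asking.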
Secrets management
- Secrets never go in code, config files, or environment variable defaults. Use a secrets manager (GCP Secret Manager, AWS Secrets Manager, HashiCorp Vault).
- .env files are for local development only and must be in .gitignore. The .env.example file contains placeholder values, never real credentials.
- Rotate credentials on a regular cadence and immediately after any suspected exposure.
- CI/CD pipelines fetch secrets at runtime from the secrets manager (see the S&P pipeline's fetch_secrets command). Never store secrets as plaintext CI environment variables.
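One way to enforce "no defaults" in application code is to fail at startup when a secret is missing. The sketch below assumes the secrets manager injects values into the process environment at runtime; requireSecret, loadConfig, and the variable names DATABASE_URL and PAYMENTS_API_KEY are hypothetical.

```typescript
type Env = Record<string, string | undefined>;

// Fail fast instead of falling back to a default. The error names the
// missing variable but never echoes a value.
function requireSecret(name: string, env: Env): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required secret: ${name}`);
  }
  return value;
}

interface AppConfig {
  databaseUrl: string;
  paymentsApiKey: string;
}

// In a real service this would be called once at startup with process.env.
function loadConfig(env: Env): AppConfig {
  return {
    databaseUrl: requireSecret("DATABASE_URL", env),
    paymentsApiKey: requireSecret("PAYMENTS_API_KEY", env),
  };
}
```

A service that refuses to boot without its secrets surfaces a misconfiguration at deploy time rather than in the first request that needs the credential.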
Dependency hygiene
- Run pnpm audit regularly. The CI pipeline does this automatically, but don't wait for CI to discover a vulnerability you could have caught locally.
- Pin dependency versions (exact versions in pnpm-lock.yaml, not floating ranges). Floating ranges mean your build can change without any code change.
- Evaluate new dependencies before adding them. Check: maintenance status, download count, known vulnerabilities, license compatibility. Every dependency is an attack surface.
HTTP security headers
Every production deployment should set these headers. Most can be configured once in the reverse proxy or framework middleware:
| Header | Purpose |
|---|---|
| Strict-Transport-Security | Force HTTPS (HSTS) |
| Content-Security-Policy | Prevent XSS by restricting script sources |
| X-Content-Type-Options: nosniff | Prevent MIME-type sniffing |
| X-Frame-Options: DENY | Prevent clickjacking |
| Referrer-Policy: strict-origin-when-cross-origin | Limit referrer leakage |
| Permissions-Policy | Restrict browser feature access (camera, mic, geolocation) |
In NestJS, use helmet middleware. In Next.js, configure headers in next.config.js. Verify with securityheaders.com.
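To make the table concrete, here is a dependency-free sketch of what a headers middleware ends up setting. The header values (the HSTS max-age, the CSP, the Permissions-Policy list) are illustrative assumptions rather than S&P defaults, and applySecurityHeaders is a hypothetical helper; in a NestJS project helmet does this job.

```typescript
// Illustrative values only. The CSP in particular must be tuned to the
// script and style sources the application actually uses.
const SECURITY_HEADERS: Readonly<Record<string, string>> = {
  "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
  "Content-Security-Policy": "default-src 'self'",
  "X-Content-Type-Options": "nosniff",
  "X-Frame-Options": "DENY",
  "Referrer-Policy": "strict-origin-when-cross-origin",
  "Permissions-Policy": "camera=(), microphone=(), geolocation=()",
};

// Anything with a setHeader method works, including Node's http.ServerResponse.
interface HeaderSink {
  setHeader(name: string, value: string): unknown;
}

function applySecurityHeaders(res: HeaderSink): void {
  for (const name of Object.keys(SECURITY_HEADERS)) {
    res.setHeader(name, SECURITY_HEADERS[name]);
  }
}
```

Setting the headers in one place (proxy or middleware) keeps individual route handlers from having to remember them.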
Rate limiting
Every API exposed to the internet needs rate limiting. Without it, credential stuffing, brute force, and API abuse are trivial to execute.
- Apply global rate limits per IP (e.g., 100 requests/minute for general endpoints).
- Apply stricter limits on sensitive endpoints: login, password reset, OTP verification, registration.
- Return 429 Too Many Requests with a Retry-After header.
- At the infrastructure level, configure Cloud Run concurrency limits or nginx limit_req as a second layer.
Rate limiting is not just about security: it also protects your infrastructure from accidental self-DDoS (a frontend bug that loops API calls) and keeps costs predictable on pay-per-request platforms.
For NestJS rate limiting implementation (@nestjs/throttler with multi-tier config), see Backend Reference -- Rate limiting.
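The mechanics behind a per-IP limit can be sketched with a fixed-window counter. This is a simplified, in-memory illustration under stated assumptions: FixedWindowRateLimiter is a hypothetical class, the clock is passed in to keep it testable, and a service running on several instances would need a shared store instead of a local Map.

```typescript
interface RateLimitDecision {
  allowed: boolean;
  // Present only when the request is rejected; maps to the Retry-After header.
  retryAfterSeconds?: number;
}

class FixedWindowRateLimiter {
  private readonly windows = new Map<string, { startedAt: number; count: number }>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // key is typically the client IP; nowMs is the current time in milliseconds.
  check(key: string, nowMs: number): RateLimitDecision {
    const current = this.windows.get(key);
    if (!current || nowMs - current.startedAt >= this.windowMs) {
      this.windows.set(key, { startedAt: nowMs, count: 1 });
      return { allowed: true };
    }
    if (current.count < this.limit) {
      current.count += 1;
      return { allowed: true };
    }
    const msLeft = current.startedAt + this.windowMs - nowMs;
    return { allowed: false, retryAfterSeconds: Math.ceil(msLeft / 1000) };
  }
}
```

A handler would translate a rejected decision into a 429 response with Retry-After set to retryAfterSeconds. Stricter endpoints (login, password reset) would simply use a second limiter instance with a lower limit.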
CORS policy
Configure CORS explicitly per environment. Never use a wildcard origin (*) in production, especially with credentials. Common mistakes: reflecting the request's Origin header verbatim (effectively no CORS), allowing localhost in production, and permitting methods the API doesn't need.
For NestJS CORS configuration examples and the full list of common mistakes, see Backend Reference -- CORS configuration.
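An explicit per-environment allowlist might look like the sketch below. The origins, the allowed methods, and the corsHeadersFor helper are assumptions made for illustration; the point is that the Origin header is compared against a fixed list rather than echoed back unconditionally.

```typescript
type Environment = "development" | "staging" | "production";

// Hypothetical origins; each environment lists exactly what it accepts.
const ALLOWED_ORIGINS: Record<Environment, readonly string[]> = {
  development: ["http://localhost:3000"],
  staging: ["https://staging.example.com"],
  production: ["https://app.example.com"],
};

// Returns the CORS response headers for a request, or an empty object when
// the origin is not on the allowlist (the browser then blocks the response).
function corsHeadersFor(env: Environment, requestOrigin: string | undefined): Record<string, string> {
  if (requestOrigin === undefined || ALLOWED_ORIGINS[env].indexOf(requestOrigin) === -1) {
    return {};
  }
  return {
    "Access-Control-Allow-Origin": requestOrigin,
    "Access-Control-Allow-Credentials": "true",
    "Access-Control-Allow-Methods": "GET, POST",
    Vary: "Origin",
  };
}
```

The Vary: Origin header matters once the allowed origin depends on the request, so that caches do not serve one origin's response to another.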
File upload security
If the project handles file uploads:
- Validate file type using magic bytes (file signature), not just the extension. A .jpg with a .exe payload passes extension-based checks.
- Set size limits at the infrastructure level (nginx client_max_body_size, Cloud Run request size) and at the application level.
- Store uploads outside the webroot: use cloud storage (GCS, S3) with signed URLs for access. Never serve uploaded files from the same domain as the application.
- Generate a new filename on upload (UUIDv7). Never use the client-provided filename for storage: it's an injection vector (path traversal via ../../etc/passwd).
- For projects handling user-generated content at scale, consider malware scanning (ClamAV, GCP DLP API).
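The first and fourth rules can be sketched together: detect the type from the leading bytes, then derive the storage name from that result alone. This is a minimal sketch with a deliberately short signature table; detectFileType and storageNameFor are hypothetical helpers, and randomUUID from Node's crypto module produces a version 4 UUID, used here only as a stand-in for the UUIDv7 the guideline asks for.

```typescript
import { randomUUID } from "node:crypto";

interface FileSignature {
  mime: string;
  ext: string;
  bytes: number[];
}

// Leading bytes for a few common formats.
const SIGNATURES: FileSignature[] = [
  { mime: "image/jpeg", ext: "jpg", bytes: [0xff, 0xd8, 0xff] },
  { mime: "image/png", ext: "png", bytes: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
  { mime: "application/pdf", ext: "pdf", bytes: [0x25, 0x50, 0x44, 0x46] }, // "%PDF"
];

// Identify the file from its content, ignoring whatever name the client sent.
function detectFileType(content: Uint8Array): FileSignature | undefined {
  return SIGNATURES.find((sig) => sig.bytes.every((byte, i) => content[i] === byte));
}

// The storage name is built from a fresh ID plus the detected extension.
// Returns undefined for unrecognized content, which the caller should reject.
function storageNameFor(content: Uint8Array): string | undefined {
  const type = detectFileType(content);
  if (!type) return undefined;
  return `${randomUUID()}.${type.ext}`;
}
```

Because the client-supplied filename is not an input to storageNameFor at all, a name like ../../etc/passwd has no path to the storage layer.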
API security
Token handling
- Use short-lived access tokens (15-60 minutes) paired with longer-lived refresh tokens. This limits the damage window if an access token is stolen.
- Refresh tokens should be rotated on use (one-time use). A refresh token that's been used twice indicates token theft: invalidate the session.
- Access tokens go in the Authorization header, not in URL parameters (URLs end up in logs, referrer headers, and browser history).
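Refresh-token rotation with reuse detection can be sketched with an in-memory store. This is an illustration under simplifying assumptions: RefreshTokenStore is a hypothetical class, tokens are kept in a Map instead of a database, and a real implementation would store token hashes rather than raw tokens.

```typescript
import { randomBytes } from "node:crypto";

interface SessionRecord {
  userId: string;
  usedTokens: Set<string>;
  revoked: boolean;
}

interface RotationResult {
  ok: boolean;
  refreshToken?: string;
  reason?: "invalid" | "reuse_detected";
}

class RefreshTokenStore {
  // Every token ever issued for a session points at the same session record.
  private readonly sessionsByToken = new Map<string, SessionRecord>();

  issue(userId: string): string {
    const token = randomBytes(32).toString("hex");
    this.sessionsByToken.set(token, { userId, usedTokens: new Set<string>(), revoked: false });
    return token;
  }

  // Exchange a refresh token for a new one. Each token works exactly once.
  rotate(presented: string): RotationResult {
    const session = this.sessionsByToken.get(presented);
    if (!session || session.revoked) return { ok: false, reason: "invalid" };
    if (session.usedTokens.has(presented)) {
      // Second use of the same token: treat as theft and kill the session.
      session.revoked = true;
      return { ok: false, reason: "reuse_detected" };
    }
    session.usedTokens.add(presented);
    const next = randomBytes(32).toString("hex");
    this.sessionsByToken.set(next, session);
    return { ok: true, refreshToken: next };
  }
}
```

Once reuse is detected, the newest token stops working too, so both the legitimate client and the attacker are forced back through login.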
API key scoping
When the project issues API keys for external consumers:
- Scope keys per client and per environment. A staging key must not work in production.
- Support key rotation without downtime: accept both old and new keys during a transition window.
- Log API key usage for auditing. If a key is compromised, you need to know what it accessed.
Webhook verification
When receiving webhooks from third-party services (Stripe, GitHub, Slack):
- Verify the webhook signature using the provider's signing secret. Every major provider supports this.
- Reject webhooks with timestamps older than 5 minutes (replay attack prevention).
- Process webhooks idempotently: the provider may retry delivery, so you can receive the same event multiple times.
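The three checks above can be sketched with Node's crypto module. The signing scheme here (HMAC-SHA256 over "timestamp.body", hex-encoded) is a generic assumption for illustration; each provider documents its own format and usually ships an SDK helper, which should be preferred. signPayload, verifyWebhook, and handleOnce are hypothetical names, and the in-memory Set stands in for a persistent store of processed event IDs.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

const MAX_AGE_SECONDS = 5 * 60;

// Hypothetical scheme: HMAC-SHA256 over "<timestamp>.<raw body>".
function signPayload(secret: string, timestamp: number, rawBody: string): string {
  return createHmac("sha256", secret).update(`${timestamp}.${rawBody}`).digest("hex");
}

interface WebhookCheck {
  secret: string;
  timestamp: number; // seconds since epoch, as sent by the provider
  rawBody: string; // the unparsed request body
  signature: string; // hex signature from the request header
  nowSeconds: number;
}

function verifyWebhook(check: WebhookCheck): boolean {
  // Replay protection: reject events outside the tolerance window.
  if (Math.abs(check.nowSeconds - check.timestamp) > MAX_AGE_SECONDS) return false;
  const expected = Buffer.from(signPayload(check.secret, check.timestamp, check.rawBody), "hex");
  const received = Buffer.from(check.signature, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}

// Idempotency: run the handler at most once per event ID.
const processedEventIds = new Set<string>();

function handleOnce(eventId: string, handler: () => void): boolean {
  if (processedEventIds.has(eventId)) return false;
  handler();
  processedEventIds.add(eventId); // recorded only after the handler succeeds
  return true;
}
```

Verification has to run against the raw request body; re-serializing parsed JSON can change the bytes and break the signature.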
Security logging and audit trail
Don't wait until Phase 3 (Observability) to log security-relevant events. At minimum, log these as structured events from day one:
| Event | What to log |
|---|---|
| Authentication failure | User identifier (email/ID), IP, timestamp, failure reason |
| Permission denied | User ID, attempted resource, attempted action |
| Admin actions | Who did what to which resource, before/after values |
| Data exports | Who exported what, how many records, timestamp |
| Account changes | Password reset, email change, role change, 2FA toggle |
| Rate limit exceeded | IP, endpoint, current count |
Log these to a structured logging pipeline (Pino + GCP Cloud Logging, not console.log). Security logs should be retained longer than application logs: 90 days minimum, ideally 1 year.
Never log secrets, passwords, full tokens, or sensitive PII (credit card numbers, SSN) in any log. Log the presence of these fields ("password": "[REDACTED]"), not their values.
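A redaction pass before logging might look like the sketch below. The key list and the redact helper are assumptions for illustration; Pino also has a built-in redact option that covers the same need declaratively, so treat this as a picture of the behavior rather than a replacement for it.

```typescript
// Hypothetical list of field names whose values must never reach the logs.
// Compared case-insensitively.
const SENSITIVE_KEYS = new Set([
  "password",
  "token",
  "accesstoken",
  "refreshtoken",
  "authorization",
  "ssn",
  "cardnumber",
]);

// Returns a deep copy with sensitive values replaced, keeping the keys so
// the log still shows that the field was present.
function redact(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map((item) => redact(item));
  }
  if (typeof value === "object" && value !== null) {
    const source = value as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(source)) {
      out[key] = SENSITIVE_KEYS.has(key.toLowerCase()) ? "[REDACTED]" : redact(source[key]);
    }
    return out;
  }
  return value;
}
```

Key-based redaction only catches secrets in known fields; it does not help if a token is interpolated into a free-text message, which is one more reason to log structured events.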
Environment isolation
Production data never flows downstream. This is a hard rule, not a guideline.
- Production credentials cannot access non-production resources and vice versa. Separate service accounts, separate secrets manager paths, separate database instances.
- Never copy production data to staging or development. Use anonymized or synthetic data. If a production dataset is needed for debugging, anonymize it first (strip PII, replace emails, randomize names) and document the process.
- Staging mirrors production infrastructure (same Terraform modules, same services) but with its own credentials, its own database, and its own secrets. The staging environment is also where pen testing happens; see the Penetration testing section.
- Development environments use local databases (Docker Compose) seeded with fixture data. No connection to any shared environment.
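The anonymization step mentioned above (strip PII, replace emails, randomize names) can be sketched as a pure function over a row. UserRow and anonymizeUser are hypothetical; the salted hash keeps replacements stable across tables so joins still work, and the .invalid top-level domain is reserved, so no generated address can receive mail.

```typescript
import { createHash } from "node:crypto";

// Hypothetical row shape for illustration.
interface UserRow {
  id: string;
  email: string;
  fullName: string;
  plan: string;
}

// Replaces direct identifiers with values derived from a salted hash of the
// row ID. Non-identifying fields (plan) are kept so the data stays useful.
function anonymizeUser(row: UserRow, salt: string): UserRow {
  const digest = createHash("sha256").update(`${salt}:${row.id}`).digest("hex").slice(0, 12);
  return {
    ...row,
    email: `user-${digest}@example.invalid`,
    fullName: `User ${digest}`,
  };
}
```

The salt should be generated per export and discarded afterwards; which fields count as PII is a per-project decision that the documented process needs to spell out.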
If a developer needs production access for debugging, it goes through an explicit access request with time-limited credentials and an audit log. Standing production access for the full team is an anti-pattern.
Incident first-response
When a security incident is discovered (credential leak, suspicious access, vulnerability exploited), the first hour matters most. Follow this sequence:
- Contain: Revoke the compromised credential, block the IP, disable the affected account. Containment before investigation. You can always re-enable access; you can't un-leak data.
- Notify: Inform the project tech lead and the CTO immediately. Don't wait until you've fully investigated. Use a dedicated Slack channel or direct message, not a public channel.
- Preserve evidence: Don't delete logs, rotate away evidence, or redeploy over the affected version. Capture what you can: timestamps, affected resources, log entries, git history.
- Investigate: Determine scope: what was accessed, how long was the exposure, which users/data are affected.
- Remediate: Fix the root cause (patch the vulnerability, rotate all related credentials, close the access path).
- Document: Write a brief incident report: timeline, root cause, impact, remediation steps, what we'll change to prevent recurrence.
The full incident management process will be detailed in Observability & Incidents. This section covers the security-specific first response.
Automated security pipeline
S&P runs an automated security scan pipeline in CI. This is not optional: every project uses this pipeline or an equivalent. The pipeline runs on every PR and on merges to protected branches, scanning for dependency vulnerabilities, code-level security patterns, leaked secrets, license compliance issues, and container image vulnerabilities.
For the full pipeline architecture, scan stage configuration, tool details, failure modes, and Semgrep/Gitleaks setup, see DevOps Reference -- Security pipeline.
Secrets in code: zero tolerance
If Gitleaks or Trivy detects a secret in your code:
- Don't just delete it and push. The secret is already in git history. Deleting it from the latest commit doesn't remove it from the repository.
- Rotate the credential immediately. Assume it's compromised. Generate a new key/token/password.
- Remove it from git history using git filter-branch or BFG Repo-Cleaner. This rewrites history, so coordinate with the team.
- Add the file pattern to .gitignore, or add the finding to the project's Gitleaks allowlist (false positives only, with a comment explaining why).
Penetration testing
S&P conducts penetration testing for client projects where the risk profile warrants it. This is a scheduled activity, not an afterthought.
Staging environment for pen testing:
Every project that requires pen testing must maintain a staging environment that mirrors production:
- Same infrastructure (Cloud Run, Cloud SQL, load balancer) provisioned via the same Terraform modules
- Same application version deployed via the same CI pipeline (the staging branch triggers staging deployment automatically)
- Realistic but non-production data (anonymized/synthetic, never copy production data to staging)
- Separate credentials and secrets (staging secrets manager, staging service accounts)
The staging environment serves two purposes: QA validation before production releases, and a safe target for penetration testing that won't impact production users or data.
Pen test process:
- Scope: Define what's in scope (API endpoints, web application, infrastructure, mobile) and what's out (third-party services, shared infrastructure).
- Schedule: Align pen tests with the release cycle. Test after major features land but before they reach production.
- Execute: Use the staging environment. Provide testers with the OpenAPI spec, application architecture, and user accounts at different permission levels.
- Remediate: Triage findings by severity. Critical and high findings block the release. Medium findings get scheduled into the next sprint. Low findings go to the backlog.
- Verify: Re-test after remediation to confirm fixes are effective.
Database security
- Use parameterized queries or ORM-generated queries exclusively. Never interpolate user input into SQL strings.
- Database credentials are per-environment, stored in the secrets manager, and rotated regularly.
- Application database users have minimum required permissions. The app should not connect as a superuser.
- Enable SSL/TLS for all database connections, including from Cloud Run to Cloud SQL (the Cloud SQL Proxy handles this for GCP).
- Audit log sensitive operations (user creation, role changes, data exports) at the application level.
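The first rule is easiest to see side by side with what it replaces. In the sketch below, the query text is a constant and user input travels separately as a bound value. The { text, values } shape should match the query config object that node-postgres accepts, but treat that as an assumption and check the driver or ORM the project actually uses; findUserByEmail and the users table are hypothetical.

```typescript
interface ParameterizedQuery {
  text: string;
  values: unknown[];
}

// Unsafe pattern, shown only as a comment:
//   `SELECT id, email FROM users WHERE email = '${email}'`
// Input such as  x' OR '1'='1  would become part of the SQL itself.

// Safe pattern: the SQL text never changes; the driver sends the value
// separately, so it is always treated as data.
function findUserByEmail(email: string): ParameterizedQuery {
  return {
    text: "SELECT id, email FROM users WHERE email = $1",
    values: [email],
  };
}
```

ORM query builders produce the same separation under the hood; the risk returns only when raw SQL fragments are concatenated with user input.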
Container security
Every container image deployed by S&P follows these security principles:
- Base images use minimal distributions (alpine, distroless) to reduce attack surface.
- Run containers as a non-root user.
- Multi-stage builds ensure only the compiled output and runtime dependencies end up in the final image. No secrets, .env files, or source code in production images.
- Trivy scans every container image in CI. Fix or acknowledge findings before deploying.
For S&P Dockerfile patterns, multi-stage build examples, and Trivy scanning configuration, see DevOps Reference -- Container security.
Critical thinking
Security vs developer velocity
Security tooling that generates hundreds of false positives trains the team to ignore it. Tune your tools:
- Start with the report failure mode when adopting the security pipeline on an existing project. Fix the real issues, allowlist the false positives, then switch to fail.
- Semgrep and Trivy rules are configurable. If a rule consistently produces false positives for your stack, suppress it with a documented reason; don't disable the entire scanner.
- Security review is part of code review, not a separate gate. Reviewers should check for the OWASP basics (input validation, auth checks, secret handling) as part of their normal review.
When to bring in specialists
The automated pipeline and secure coding practices handle the baseline. Bring in security specialists (internal or external pen testers) when:
- The project handles PII, financial data, or health records
- The project exposes a public API
- The project is entering a regulated market
- A significant architectural change occurs (new auth system, new data store, new third-party integration)
Dependency updates: the unglamorous essential
Many real-world breaches exploit known vulnerabilities in unpatched dependencies. The security pipeline catches these, but someone has to actually fix them. Assign dependency updates as regular sprint work, not a "when we get to it" backlog item. A weekly 30-minute pass through pnpm audit findings and Trivy results prevents the debt from compounding.
Checklist
For every PR
- No secrets, API keys, or credentials in the code or config files
- User input is validated at the API boundary with schema validation
- Authorization checks exist for every endpoint that accesses user-specific data
- SQL queries use parameterized queries or ORM methods (no string interpolation)
- New dependencies have been evaluated for security, maintenance, and license
- .env.example is updated if new environment variables were added
- File uploads validate type (magic bytes), enforce size limits, and store externally
- Webhook handlers verify signatures and handle idempotency
- Security-relevant events are logged (auth failures, permission denied, admin actions)
- No sensitive data (passwords, tokens, PII) in log output
- Security scan pipeline passes (or findings are triaged and documented)
For every project
- Automated security pipeline is configured and running in CI
- Secrets are stored in a secrets manager, not in code or CI environment variables
- HTTP security headers are configured (HSTS, CSP, X-Frame-Options, etc.)
- CORS is configured explicitly per environment (no wildcard origins in production)
- Rate limiting is enabled on public endpoints, stricter on auth endpoints
- Authentication uses an established provider (Firebase Auth, Auth0, etc.)
- Access tokens are short-lived with rotating refresh tokens
- Database connections use SSL/TLS and minimum-privilege credentials
- Container images use minimal base images and run as non-root
- Environments are isolated: production data never flows to staging/dev
- Staging environment exists, mirrors production, and is used for pen testing
- Incident first-response process is known to the team
- Dependency updates are reviewed weekly
AI tips
- Review code for OWASP vulnerabilities. Paste a controller or API handler and ask AI to audit it for injection, broken auth, IDOR, and other OWASP Top 10 issues. AI is good at spotting missing validation, unguarded endpoints, and SQL injection patterns.
- Generate security headers configuration. Describe your deployment setup (NestJS behind Cloud Run, Next.js on Vercel) and ask AI to generate the appropriate security headers configuration. Review the CSP policy: it needs to match your actual script and style sources.
- Triage scan results. Paste Trivy or Semgrep JSON output and ask AI to summarize findings by severity, identify which are actionable vs false positives, and suggest fixes for the actionable ones.
- Write Gitleaks allowlist entries. When a scan flags a false positive (test fixtures, example tokens), ask AI to generate the .gitleaks.toml allowlist entry with the right regex and a clear comment.
- Generate pen test scope documents. Describe your application architecture and ask AI to draft a pen test scope document listing endpoints, auth flows, and data sensitivity classifications.
Resources
S&P internal:
- Security pipeline implementation. CircleCI security scan pipeline PR
- Backend monorepo template. Includes security pipeline configuration
OWASP:
- OWASP Top 10. Most critical web application security risks
- OWASP Cheat Sheet Series. Practical security guidance by topic
- OWASP Application Security Verification Standard (ASVS). Security requirements framework
Tools:
- Trivy. Vulnerability scanner (dependencies, containers, IaC, secrets, licenses)
- Semgrep. Static analysis for security patterns
- Gitleaks. Secret detection in git repositories
- helmet. Security headers middleware for Express/NestJS
- BFG Repo-Cleaner. Remove secrets from git history
References:
- 12-Factor App (Config). Store config in the environment, not in code
- NIST Cybersecurity Framework. Risk management framework