IR Playbook: Hunting Automated Probes for Exposed Repositories and Cloud Paths

November 09, 2025 (Last Modified: November 09, 2025)

4n6 Beat



On November 8, 2025, the SANS Internet Storm Center reported honeypot hits probing common repository and cloud-related paths, including /.git/logs/refs/remotes/origin/main, /.git/objects/info, /.github/* (such as dependabot.yml), /.gitlab/*, /.gitlab-ci, /.git-secret, /.svnignore, and cloud-y paths like /aws/bucket, /s3/backup, /s3/bucket, /s3/credentials (ISC Diary). If any of these return 200s, you may be serving source, CI config, or credentials. The rest of this post walks through a fast, repeatable response.

Intrusion Flow

Recon and probing: Automated clients request telltale repo/CI paths such as /.git/HEAD, /.git/config, .github/*, .gitlab-ci*, .svn/*, or /s3/* looking for misdeployments (PortSwigger, GitHub Docs: dependabot.yml location, GitLab CI YAML).
Exploitation if exposed: If /.git/ is reachable, attackers can reconstruct history via targeted downloads (e.g., /.git/HEAD, refs, objects) or off-the-shelf dumpers (arthaud/git-dumper, GitTools). Advisory sites treat exposed VCS dirs as source disclosure risks (Acunetix on .git).
Post-exploitation: Harvest secrets embedded in history or CI files using secret scanners; leaked tokens often enable cloud pivots (Gitleaks, TruffleHog).
Cloud angle: Attackers also test S3 naming or credential endpoints; your guardrail here is account- and bucket-level S3 Block Public Access, which has been on by default for new buckets since April 28, 2023, and is recommended broadly (AWS Prescriptive Guidance, S3 BPA user guide, AWS announcement).

Key Artifacts to Pull

Web access logs from the serving tier (reverse proxies, WAFs, app servers):
- NGINX: confirm actual log file and format via access_log and log_format (defaults vary by distro; see NGINX module docs) (nginx log module, admin logging guide).
- Apache HTTPD: Combined Log Format reference and locations configured via CustomLog (Apache docs).
- IIS: W3C logs (fields like cs-uri-stem, cs-uri-query, sc-status) and log storage under W3SVC; field list in Microsoft documentation (Microsoft Learn, W3C logging fields, IIS logging overview).
Server configs for containment validation:
- NGINX: location ~ /\.(?!well-known) { deny all; } is a common pattern to block dotfiles while allowing ACME challenges (Bolt CMS nginx example).
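- Apache: <FilesMatch "^\."> Require all denied </FilesMatch> blocks dotfiles. Note that <FilesMatch> tests only the file's base name, so it covers files such as .gitignore but not files inside dot-directories such as /.git/config; to cover those, add <DirectoryMatch "/\.(?!well-known)"> Require all denied </DirectoryMatch> in the server or vhost config (Apache core: <Files>/<FilesMatch>).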
Evidence if exposure occurred:
- Sample served files (e.g., /.git/HEAD, /.git/config, .gitlab-ci.yml) for scoping; prefer capturing over-the-wire evidence, and record its hash in your case notes.
- HTTP status codes context: 200 means the resource was served; 403 means refused; 404 means not found (MDN 200, MDN 403, MDN 404).
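To make the "hash it in your case notes" step concrete, here is a minimal Python sketch. It assumes you already hold the captured response body as bytes; the function name evidence_record, the record fields, and the example URL are illustrative, not part of any standard.

```python
import hashlib
from datetime import datetime, timezone


def evidence_record(url: str, status: int, body: bytes) -> dict:
    """Build a case-note entry for one captured HTTP response body."""
    return {
        "url": url,
        "status": status,
        "size_bytes": len(body),
        # SHA-256 of the exact bytes that were served.
        "sha256": hashlib.sha256(body).hexdigest(),
        # Time the record was created, not necessarily the time of capture.
        "recorded_at_utc": datetime.now(timezone.utc).isoformat(timespec="seconds"),
    }


# Illustrative bytes resembling a served /.git/HEAD (hypothetical host).
sample_body = b"ref: refs/heads/main\n"
record = evidence_record("https://www.example.test/.git/HEAD", 200, sample_body)
print(record)
```

Hash the bytes exactly as captured, before any decoding or newline normalization, so the digest can be re-verified against the stored evidence later.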

Detection Notes

The goal is to quickly identify requests to risky repo/CI/cloud paths and prioritize 200s.

Quick grep on Linux log bundles:

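# [^ /?]* lets .git also match .github, .gitlab-ci.yml, and .git-secret, and .svn match .svnignore
grep -nE '"(GET|HEAD) /(\.(git|svn)[^ /?]*|s3|aws)([/? ]|$)' -- *.log* |
  grep -E '" (200|206) '  # triage the hits that served content (status right after the quoted request)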

Splunk examples:

index=web sourcetype=access_* (uri_path="/.git*" OR uri_path="/.github*" OR uri_path="/.gitlab*" OR uri_path="/.svn*" OR uri_path="/s3*" OR uri_path="/aws*")
| stats count by uri_path status useragent src_ip
| where status IN (200,206)

Elastic/Lucene (adjust field names):

request:("/.git" OR "/.github" OR "/.gitlab" OR "/.gitlab-ci" OR "/.svn" OR "/s3" OR "/aws") AND (status:200 OR status:206)

Write a Sigma rule for webserver logs (logsource category: webserver) to flag requests to repo/cloud paths and prioritize 200/206. See Sigma spec for logsources and rule structure (Sigma basics: logsources, rule creation guide).

Paths to include in your detection lists (from the honeypot observation):

/.git/logs/refs/remotes/origin/main, /.git/objects/info, /.github/* (e.g., dependabot.yml, ISSUE_TEMPLATE/), /.gitlab/issue_templates, /.gitlab-ci, /.git-secret, /.svnignore, /aws/bucket, /s3/backup, /s3/bucket, /s3/credentials (ISC Diary, GitHub issue templates path, FUNDING.yml in .github, GitLab templates path).
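For log bundles that are easier to handle in a script than in grep, the same idea can be sketched in Python. This assumes Combined Log Format style lines (quoted request followed by the status code); the regexes, the function name risky_hit, and the sample lines are illustrative and will need adjusting to your actual log format.

```python
import re

# Quoted request followed by the status code, as in Combined Log Format:
# "METHOD /target HTTP/x.y" 200 ...
LINE_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<target>\S+)[^"]*" (?P<status>\d{3}) ')

# First path segment is a repo/CI dot-directory or dotfile (.git, .github,
# .gitlab-ci.yml, .git-secret, .svnignore, .hg, ...) or an /s3 or /aws prefix.
RISKY_RE = re.compile(r"^/(?:\.(?:git|svn|hg)[^/]*|s3|aws)(?:/|$)")


def risky_hit(line: str):
    """Return a dict for a request to a risky path, or None otherwise."""
    match = LINE_RE.search(line)
    if not match:
        return None
    path = match.group("target").split("?", 1)[0]  # drop any query string
    if not RISKY_RE.match(path):
        return None
    status = int(match.group("status"))
    return {"path": path, "status": status, "served": status in (200, 206)}


# Hypothetical sample lines using documentation IP ranges.
samples = [
    '203.0.113.7 - - [08/Nov/2025:10:00:00 +0000] "GET /.git/HEAD HTTP/1.1" 200 23 "-" "curl/8.5.0"',
    '203.0.113.7 - - [08/Nov/2025:10:00:01 +0000] "GET /.gitlab-ci.yml?x=1 HTTP/1.1" 404 0 "-" "-"',
    '198.51.100.9 - - [08/Nov/2025:10:00:02 +0000] "GET /awstats/awstats.pl HTTP/1.1" 200 512 "-" "-"',
]
hits = [hit for hit in map(risky_hit, samples) if hit]
for hit in hits:
    print(hit)
```

Anchoring on the first path segment keeps unrelated paths such as /awstats from matching the /aws prefix.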

Response Guidance

When the pager goes off, move in this order. We aim to keep toil low even if tooling is brittle.

Rapid triage

Filter for 200/206 responses to the paths above. Treat any successful fetch of /.git/HEAD, /.git/config, .gitlab-ci.yml, .hg/*, or .svn/* as potential source disclosure (Acunetix: .git).
If only 403s are returned, you likely blocked access; still verify server configs to prevent regressions (MDN 403).
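The triage order above can be expressed as a small bucketing helper. This is a sketch under stated assumptions: the bucket names and the (src_ip, path, status) tuple shape are hypothetical, and anything other than 200/206/403/404 (redirects, 5xx) is simply set aside for manual review.

```python
from collections import defaultdict


def triage_bucket(status: int) -> str:
    """Map an HTTP status to a triage bucket."""
    if status in (200, 206):
        return "served"      # content was returned: potential disclosure
    if status == 403:
        return "blocked"     # refused: still verify the server config
    if status == 404:
        return "not_found"   # probe noise
    return "review"          # redirects, 5xx, etc.: check by hand


def triage(hits):
    """Group (src_ip, path, status) tuples by triage bucket."""
    buckets = defaultdict(list)
    for src_ip, path, status in hits:
        buckets[triage_bucket(status)].append((src_ip, path, status))
    return dict(buckets)


# Hypothetical hits using documentation IP ranges.
example_hits = [
    ("203.0.113.7", "/.git/config", 200),
    ("203.0.113.7", "/.git/HEAD", 403),
    ("198.51.100.9", "/s3/credentials", 404),
    ("198.51.100.9", "/.svn/entries", 301),
]
result = triage(example_hits)
for bucket in ("served", "blocked", "review", "not_found"):
    print(bucket, result.get(bucket, []))
```

Work the "served" bucket first; an empty one does not remove the need to confirm the deny rules behind the "blocked" entries.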

Contain

Immediately block dotfiles on your frontends:
- NGINX: add a dotfile deny rule (keep /.well-known allowed for ACME), reload, and re-test (Bolt CMS nginx example).
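- Apache: deploy <FilesMatch "^\."> Require all denied </FilesMatch> via vhost or .htaccess and reload; because <FilesMatch> tests only the file's base name, also deny dot-directories such as /.git in the server or vhost config (for example with <DirectoryMatch>), then re-test (Apache core docs).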
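The re-test step can be scripted against your own hosts. The sketch below uses only the Python standard library; the probe list, the verdict labels, and the ACME token path are assumptions for illustration. urlopen follows redirects, so a redirect that lands on a 200 page will be reported as exposed and needs a manual look.

```python
import urllib.error
import urllib.request
from urllib.parse import urljoin

# Paths to re-test after deploying the deny rule (illustrative selection).
PROBE_PATHS = (
    "/.git/HEAD",
    "/.git/config",
    "/.gitlab-ci.yml",
    "/.well-known/acme-challenge/retest-token",  # hypothetical token; 404 is fine
)


def fetch_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status for a HEAD request (network call)."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx/5xx arrive as exceptions


def verdict(path: str, status: int) -> str:
    """Dotfile paths must not be served; /.well-known must not be refused."""
    if path.startswith("/.well-known/"):
        return "acme_blocked" if status == 403 else "ok"
    return "exposed" if status in (200, 206) else "ok"


def retest(base_url: str, fetch=fetch_status) -> dict:
    """Probe each path under base_url and return {path: verdict}."""
    return {path: verdict(path, fetch(urljoin(base_url, path))) for path in PROBE_PATHS}
```

Calling retest("https://www.example.test") with your own site's base URL should return "ok" for every path once the rule is in place. Only run it against properties you are authorized to test.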
For S3, confirm S3 Block Public Access is ON at the account and bucket levels; AWS recommends enabling all four BPA settings unless a specific public use case exists (S3 BPA user guide, AWS Security Hub control S3.8).
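Checking "all four BPA settings" is easy to automate once you have the configuration in hand. This sketch assumes a dict shaped like the PublicAccessBlockConfiguration structure returned by the S3 GetPublicAccessBlock API (account or bucket level); how you fetch it (console export, CLI, SDK) is left out, and the function name is hypothetical.

```python
# The four S3 Block Public Access settings.
BPA_SETTINGS = (
    "BlockPublicAcls",
    "IgnorePublicAcls",
    "BlockPublicPolicy",
    "RestrictPublicBuckets",
)


def disabled_bpa_settings(config: dict) -> list:
    """Return the BPA settings that are missing or not explicitly True."""
    return [name for name in BPA_SETTINGS if config.get(name) is not True]


# Hypothetical bucket-level configuration with one setting turned off.
bucket_config = {
    "BlockPublicAcls": True,
    "IgnorePublicAcls": True,
    "BlockPublicPolicy": False,
    "RestrictPublicBuckets": True,
}
gaps = disabled_bpa_settings(bucket_config)
print("BPA gaps:", gaps or "none")
```

Treating a missing key as a gap is a deliberately strict choice: an absent configuration should prompt a check rather than pass silently.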

Scope

If a repo was reachable, safely acquire a copy for forensics (from evidence or a controlled re-fetch) and run secret scans. Tools: Gitleaks and TruffleHog identify common tokens and can be automated in CI (Gitleaks, TruffleHog).
Review .git/config for embedded HTTP(S) credentials if teams used credential-in-URL patterns (a known pitfall) (TechTarget explainer).
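A quick way to review an acquired .git/config for credential-in-URL patterns is sketched below. It is a line-based check rather than a full Git config parser; the function name, finding fields, and sample config are illustrative. A bare username is flagged too, because tokens are sometimes placed in the username position.

```python
import re
from urllib.parse import urlsplit

# Matches "url = ..." and "pushurl = ..." lines in a Git config file.
URL_LINE_RE = re.compile(r"^\s*(?:push)?url\s*=\s*(\S+)", re.IGNORECASE)


def embedded_credentials(config_text: str) -> list:
    """Find HTTP(S) remote URLs that carry userinfo (user or user:password)."""
    findings = []
    for lineno, line in enumerate(config_text.splitlines(), start=1):
        match = URL_LINE_RE.match(line)
        if not match:
            continue
        parts = urlsplit(match.group(1))
        if parts.scheme not in ("http", "https"):
            continue
        if parts.username or parts.password:
            host = parts.netloc.rsplit("@", 1)[-1]  # strip the userinfo
            findings.append({
                "line": lineno,
                "username": parts.username,
                "has_password": parts.password is not None,
                "url_redacted": parts._replace(netloc=host).geturl(),
            })
    return findings


# Hypothetical config content; the credential is made up.
SAMPLE_CONFIG = (
    "[core]\n"
    "\trepositoryformatversion = 0\n"
    '[remote "origin"]\n'
    "\turl = https://deploy-bot:s3cr3t-token@git.example.test/acme/site.git\n"
    "\tfetch = +refs/heads/*:refs/remotes/origin/*\n"
)
findings = embedded_credentials(SAMPLE_CONFIG)
for finding in findings:
    print(finding)
```

The output keeps only a redacted URL so the secret itself does not get copied into tickets or case notes; treat any finding as a credential to rotate.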

Eradicate and harden

Redeploy without VCS metadata; don’t serve .git, .hg, or .svn in web roots (PortSwigger).
Keep dotfile denies in web server configs as a standing control (Apache core, NGINX logging/config docs for verification).
If secrets were exposed, rotate them and, if necessary, rewrite history (git filter-repo/BFG). History rewriting does not retroactively “unleak” anything, so rotation is mandatory (BFG cleanup guidance).

Monitor

Keep a lightweight content rule or Sigma detection to alert on repo/CI/cloud path hits returning 200/206 (Sigma basics). Consider periodic external scans of your own properties for .git exposure using safe methods analogous to public tooling (do not scan others without authorization) (GitTools).

Takeaways

Add or verify dotfile denies on all internet-facing web servers today (Apache, NGINX).
Hunt web logs for 200s to repo/CI/cloud paths and ticket any positives (start with the list in the ISC diary) (ISC Diary).
If exposed, capture evidence, scan for secrets, rotate credentials, and redeploy without VCS metadata (Gitleaks, TruffleHog).
For S3, ensure Block Public Access is enforced at account and bucket levels, and route any public content through CloudFront or other controlled patterns (AWS BPA, Security Hub control).

Sources / References

SANS ISC Diary – Honeypot: Requests for (Code) Repositories: https://isc.sans.edu/diary/Honeypot%2BRequests%2Bfor%2BCode%2BRepositories/32460
PortSwigger Web Security Academy – Information disclosure (version control): https://portswigger.net/web-security/information-disclosure/exploiting
Acunetix – GIT Detected exposed: https://www.acunetix.com/vulnerabilities/web/git-detected-exposed/
GitTools – Finder/Dumper/Extractor: https://github.com/internetwache/GitTools
arthaud/git-dumper: https://github.com/arthaud/git-dumper
Gitleaks: https://github.com/gitleaks/gitleaks
TruffleHog: https://github.com/trufflesecurity/trufflehog
TechTarget – Avoid Git repository security risk: https://www.techtarget.com/searchsecurity/answer/How-can-developers-avoid-a-Git-repository-security-risk
AWS Prescriptive Guidance – Prevent public S3 access (ACCT.08): https://docs.aws.amazon.com/prescriptive-guidance/latest/aws-startup-security-baseline/acct-08.html
Amazon S3 – Block Public Access: https://docs.aws.amazon.com/AmazonS3/latest/userguide/access-control-block-public-access.html
AWS News – S3 Block Public Access: https://aws.amazon.com/about-aws/whats-new/2018/11/introducing-amazon-s3-block-public-access/
NGINX – ngx_http_log_module: https://nginx.org/en/docs/http/ngx_http_log_module.html
NGINX – Admin Guide: Logging: https://docs.nginx.com/nginx/admin-guide/monitoring/logging/
Apache HTTP Server – Log Files: https://httpd.apache.org/docs/current/logs.html
Microsoft Learn – Configure Logging in IIS: https://learn.microsoft.com/en-us/iis/manage/provisioning-and-managing-iis/configure-logging-in-iis
Microsoft – W3C Logging: https://learn.microsoft.com/en-us/windows/win32/http/w3c-logging
Microsoft – IIS Logging Overview: https://learn.microsoft.com/en-us/previous-versions/iis/6.0-sdk/ms525410%28v%3Dvs.90%29
Apache HTTP Server – core (<Files>/<FilesMatch>): https://httpd.apache.org/docs/2.4/mod/core.html
Bolt CMS – Example NGINX block for dotfiles: https://docs.boltcms.io/5.2/installation/webserver/nginx
MDN – HTTP 200 OK: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/200
MDN – HTTP 403 Forbidden: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/403
MDN – HTTP 404 Not Found: https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Status/404
GitHub Docs – Dependabot config in .github/dependabot.yml: https://docs.github.com/en/code-security/dependabot/working-with-dependabot/dependabot-options-reference
GitHub Docs – Issue templates path: https://docs.github.com/articles/about-issue-and-pull-request-templates
GitHub Docs – FUNDING.yml in .github: https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/displaying-a-sponsor-button-in-your-repository
GitLab Docs – CI/CD YAML reference: https://docs.gitlab.com/ci/yaml/
GitLab Docs – Instance template paths (.gitlab/issue_templates): https://docs.gitlab.com/administration/settings/instance_template_repository/
BFG Repo-Cleaner – Remove Sensitive Data: https://bfg-repo-cleaner-demos.github.io/github-help/RemoveSensitiveData.html
Sigma – Log sources and rules: https://sigmahq.io/docs/basics/log-sources.html