IR Playbook: Hunting Automated Probes for Exposed Repositories and Cloud Paths
On November 8, 2025, the SANS Internet Storm Center reported honeypot hits probing common repository and cloud-related paths, including /.git/logs/refs/remotes/origin/main, /.git/objects/info, /.github/* (such as dependabot.yml), /.gitlab/*, /.gitlab-ci, /.git-secret, /.svnignore, and cloud-y paths like /aws/bucket, /s3/backup, /s3/bucket, /s3/credentials (ISC Diary). If any of these return 200s, you may be serving source, CI config, or credentials. The rest of this post walks through a fast, repeatable response.
Intrusion Flow
- Recon and probing: Automated clients request telltale repo/CI paths such as /.git/HEAD, /.git/config, .github/*, .gitlab-ci*, .svn/*, or /s3/* looking for misdeployments (PortSwigger, GitHub Docs: dependabot.yml location, GitLab CI YAML).
- Exploitation if exposed: If /.git/ is reachable, attackers can reconstruct history via targeted downloads (e.g., /.git/HEAD, refs, objects) or off-the-shelf dumpers (arthaud/git-dumper, GitTools). Advisory sites treat exposed VCS dirs as source disclosure risks (Acunetix on .git).
- Post-exploitation: Attackers harvest secrets embedded in history or CI files using secret scanners; leaked tokens often enable cloud pivots (Gitleaks, TruffleHog).
- Cloud angle: Attackers also test S3 naming or credential endpoints; your guardrail here is account/bucket-level S3 Block Public Access, which has been on by default for new buckets since April 28, 2023 and is recommended broadly (AWS Prescriptive Guidance, S3 BPA user guide, AWS announcement).
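The Block Public Access guardrail in the last bullet can be checked mechanically. The following is a minimal sketch, assuming you have already fetched the configuration as a dictionary (for example, the PublicAccessBlockConfiguration mapping that boto3's get_public_access_block returns). The function name missing_bpa_settings and the sample input are hypothetical and only illustrate the comparison against the four settings.

```python
# The four S3 Block Public Access flags, as named in the S3 API.
BPA_SETTINGS = (
    "BlockPublicAcls",
    "IgnorePublicAcls",
    "BlockPublicPolicy",
    "RestrictPublicBuckets",
)


def missing_bpa_settings(config):
    """Return the BPA flags that are absent or not set to True.

    `config` is assumed to be the PublicAccessBlockConfiguration mapping,
    e.g. s3.get_public_access_block(Bucket=...)["PublicAccessBlockConfiguration"]
    when using boto3 (fetching it is outside the scope of this sketch).
    """
    return [name for name in BPA_SETTINGS if config.get(name) is not True]


if __name__ == "__main__":
    # Illustrative input only; not taken from a real account.
    sample = {
        "BlockPublicAcls": True,
        "IgnorePublicAcls": True,
        "BlockPublicPolicy": False,
        "RestrictPublicBuckets": True,
    }
    gaps = missing_bpa_settings(sample)
    print("BPA fully enabled" if not gaps else f"BPA gaps: {gaps}")
```

Treating a missing key the same as False is a deliberate choice here: a bucket or account with no configuration at all should show up as a gap rather than pass silently.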
Key Artifacts to Pull
- Web access logs from the serving tier (reverse proxies, WAFs, app servers):
- NGINX: confirm actual log file and format via access_log and log_format (defaults vary by distro; see NGINX module docs) (nginx log module, admin logging guide).
- Apache HTTPD: Combined Log Format reference and locations configured via CustomLog (Apache docs).
- IIS: W3C logs (fields like cs-uri-stem, cs-uri-query, sc-status) and log storage under W3SVC; field list in Microsoft documentation (Microsoft Learn, W3C logging fields, IIS logging overview).
- Server configs for containment validation:
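- NGINX: location ~ /\.(?!well-known) { deny all; } is a common pattern to block dotfiles while allowing ACME challenges (Bolt CMS nginx example).
- Apache: <FilesMatch "^\."> Require all denied </FilesMatch> blocks dotfiles by name, but it tests only the final path component, so confirm that a path-level rule also covers files inside dot-directories such as /.git/config (Apache core docs).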
- Evidence if exposure occurred:
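To sanity-check the NGINX dotfile pattern listed under server configs above before relying on it, the same regular expression can be exercised offline. This is a sketch only: it assumes Python's re module behaves like NGINX's PCRE for this simple expression, and it ignores location precedence and URI normalization on the real server. The names DOTFILE_DENY and denied_by_dotfile_rule are hypothetical.

```python
import re

# Same expression as the NGINX example: a "/." not followed by "well-known".
# NGINX `location ~` does an unanchored, case-sensitive regex match on the
# request path, which re.search() approximates here.
DOTFILE_DENY = re.compile(r"/\.(?!well-known)")


def denied_by_dotfile_rule(path):
    """True if the dotfile deny pattern would match this request path."""
    return DOTFILE_DENY.search(path) is not None


if __name__ == "__main__":
    for path in (
        "/.git/HEAD",
        "/app/.git/config",
        "/.github/dependabot.yml",
        "/.svnignore",
        "/.well-known/acme-challenge/token",
        "/s3/credentials",
        "/index.html",
    ):
        verdict = "deny" if denied_by_dotfile_rule(path) else "pass"
        print(f"{verdict:5} {path}")
```

The check shows that the rule covers the repository and CI paths but says nothing about the cloud-style paths such as /s3/credentials, which contain no dot component and need to be handled separately. A live re-test against the server is still required after any config change.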
Detection Notes
The goal is to quickly identify requests to risky repo/CI/cloud paths and prioritize 200s.
- Quick grep on Linux log bundles:
grep -nE '"(GET|HEAD) /(\.(git|svn)[^/? ]*|s3|aws)([/? ]|$)' -- *.log* |
grep -E '" (200|206) ' # triage the hits that served content (assumes Common/Combined Log Format; use zgrep for rotated .gz files)
- Splunk examples:
index=web sourcetype=access_* (uri_path="/.git*" OR uri_path="/.github*" OR uri_path="/.gitlab*" OR uri_path="/.svn*" OR uri_path="/s3*" OR uri_path="/aws*")
| stats count by uri_path status useragent src_ip
| where status IN (200,206)
- Elastic/Lucene (adjust field names):
request:("/.git" OR "/.github" OR "/.gitlab" OR "/.gitlab-ci" OR "/.svn" OR "/s3" OR "/aws") AND (status:200 OR status:206)
- Write a Sigma rule for webserver logs (logsource category: webserver) to flag requests to repo/cloud paths and prioritize 200/206. See Sigma spec for logsources and rule structure (Sigma basics: logsources, rule creation guide).
Paths to include in your detection lists (from the honeypot observation):
/.git/logs/refs/remotes/origin/main, /.git/objects/info, /.github/* (e.g., dependabot.yml, ISSUE_TEMPLATE/), /.gitlab/issue_templates, /.gitlab-ci, /.git-secret, /.svnignore, /aws/bucket, /s3/backup, /s3/bucket, /s3/credentials (ISC Diary, GitHub issue templates path, FUNDING.yml in .github, GitLab templates path).
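Where grep or a SIEM is not at hand, the same triage can be scripted. The sketch below assumes Common or Combined Log Format (request line in quotes, followed by the status code); IIS W3C logs and custom log_format layouts would need a different pattern. The names LOG_LINE, is_risky, and triage are hypothetical, and the prefix lists are a condensed form of the paths above.

```python
import re
import sys

# Request line and status as they appear in Common/Combined Log Format.
LOG_LINE = re.compile(r'"(?P<method>[A-Z]+) (?P<target>\S+)[^"]*" (?P<status>\d{3}) ')

# "/.git" also covers /.github, /.gitlab, /.gitlab-ci and /.git-secret;
# "/.svn" also covers /.svnignore.
DOT_PREFIXES = ("/.git", "/.svn")
# Matched only as whole path segments, so /awstats is not flagged.
CLOUD_DIRS = ("/s3", "/aws")
# Status codes that mean content was actually served.
SERVED = {"200", "206"}


def is_risky(path):
    """True if the path falls under one of the repo/CI/cloud prefixes."""
    p = path.lower()
    if p.startswith(DOT_PREFIXES):
        return True
    return any(p == d or p.startswith(d + "/") for d in CLOUD_DIRS)


def triage(lines):
    """Yield (status, path) for risky requests that returned 200 or 206."""
    for line in lines:
        m = LOG_LINE.search(line)
        if m is None or m["status"] not in SERVED:
            continue
        path = m["target"].split("?", 1)[0]  # drop any query string
        if is_risky(path):
            yield m["status"], path


if __name__ == "__main__":
    # Usage: python triage.py access.log [more.log ...]
    for name in sys.argv[1:]:
        with open(name, encoding="utf-8", errors="replace") as fh:
            for status, path in triage(fh):
                print(name, status, path)
```

The sketch does not decode percent-encoded paths (for example /%2egit), so treat an empty result as a starting point rather than proof of no exposure.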
Response Guidance
When the pager goes off, move in this order. We aim to keep toil low even if tooling is brittle.
- Rapid triage
- Filter for 200/206 responses to the paths above. Treat any successful fetch of /.git/HEAD, /.git/config, .gitlab-ci.yml, .hg/*, or .svn/* as potential source disclosure (Acunetix: .git).
- If only 403s are returned, access was likely blocked; still verify server configs to prevent regressions (MDN 403).
- Contain
- Immediately block dotfiles on your frontends:
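- NGINX: add a dotfile deny rule (keep /.well-known allowed for ACME), reload, and re-test (Bolt CMS nginx example).
- Apache: deploy <FilesMatch "^\."> Require all denied </FilesMatch> via vhost or .htaccess and reload (Apache core docs). Because <FilesMatch> tests only the final path component, on its own it does not cover files inside a dot-directory such as /.git/config; add a path-level rule as well, for example RedirectMatch 404 /\.(git|svn|hg), and re-test.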
- For S3, confirm S3 Block Public Access is ON at the account and bucket levels; AWS recommends enabling all four BPA settings unless a specific public use case exists (S3 BPA user guide, AWS Security Hub control S3.8).
- Scope
- If a repo was reachable, safely acquire a copy for forensics (from evidence or a controlled re-fetch) and run secret scans. Tools: Gitleaks and TruffleHog identify common tokens and can be automated in CI (Gitleaks, TruffleHog).
- Review .git/config for embedded HTTP(S) credentials if teams used credential-in-URL patterns (a known pitfall) (TechTarget explainer).
- Eradicate and harden
- Redeploy without VCS metadata; don’t serve .git, .hg, or .svn in web roots (PortSwigger).
- Keep dotfile denies in web server configs as a standing control (Apache core, NGINX logging/config docs for verification).
- If secrets were exposed, rotate them and, if necessary, rewrite history (git filter-repo/BFG), knowing that history rewriting does not retroactively “unleak” anything: rotation is mandatory (BFG cleanup guidance).
- Monitor
- Keep a lightweight content rule or Sigma detection to alert on repo/CI/cloud path hits returning 200/206 (Sigma basics). Consider periodic external scans of your own properties for .git exposure using safe methods analogous to public tooling (do not scan others without authorization) (GitTools).
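The periodic self-scan in the Monitor step can be as small as a single request per property. The sketch below is one possible approach, not a replacement for the tooling cited above: it fetches /.git/HEAD once and checks whether the body has the shape of a real HEAD file, which avoids false positives from soft-404 pages that return 200. The names looks_like_git_head and check_site are hypothetical, and the URLs you pass in are assumed to be properties you own or are authorized to test.

```python
import re
import sys
import urllib.error
import urllib.request

# A real .git/HEAD is either a symbolic ref or a raw commit id
# (40 hex characters for SHA-1 repositories, 64 for SHA-256).
GIT_HEAD = re.compile(r"^(ref: refs/\S+|[0-9a-f]{40}(?:[0-9a-f]{24})?)\s*$")


def looks_like_git_head(body):
    """Distinguish a served .git/HEAD from an error page that returns 200."""
    return GIT_HEAD.match(body) is not None


def check_site(base_url, timeout=5.0):
    """Fetch <base_url>/.git/HEAD once; True means the repo looks exposed.

    Only point this at properties you own or are authorized to test.
    """
    url = base_url.rstrip("/") + "/.git/HEAD"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read(256).decode("utf-8", errors="replace")
    except (urllib.error.URLError, OSError):
        # 4xx/5xx, connection failure, or timeout: not treated as exposed.
        return False
    return looks_like_git_head(body)


if __name__ == "__main__":
    # Usage: python selfcheck.py https://www.example.com [more URLs ...]
    for site in sys.argv[1:]:
        print(site, "EXPOSED" if check_site(site) else "ok")
```

A False result only means this one file was not served in recognizable form; it does not prove the rest of the directory is blocked, so pair it with the config verification described under Contain.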
Takeaways
- Add or verify dotfile denies on all internet-facing web servers today (Apache, NGINX).
- Hunt web logs for 200s to repo/CI/cloud paths and ticket any positives (start with the list in the ISC diary) (ISC Diary).
- If exposed, capture evidence, scan for secrets, rotate credentials, and redeploy without VCS metadata (Gitleaks, TruffleHog).
- For S3, ensure Block Public Access is enforced at account and bucket levels, and route any public content through CloudFront or other controlled patterns (AWS BPA, Security Hub control).
Sources / References
- SANS ISC Diary – Honeypot: Requests for (Code) Repositories: https://isc.sans.edu/diary/Honeypot%2BRequests%2Bfor%2BCode%2BRepositories/32460
- PortSwigger Web Security Academy – Information disclosure (version control): https://portswigger.net/web-security/information-disclosure/exploiting
- Acunetix – GIT Detected exposed: https://www.acunetix.com/vulnerabilities/web/git-detected-exposed/
- GitTools – Finder/Dumper/Extractor: https://github.com/internetwache/GitTools
- arthaud/git-dumper: https://github.com/arthaud/git-dumper
- Gitleaks: https://github.com/gitleaks/gitleaks
- TruffleHog: https://github.com/trufflesecurity/trufflehog
- TechTarget – Avoid Git repository security risk: https://www.techtarget.com/searchsecurity/answer/How-can-developers-avoid-a-Git-repository-security-risk
- AWS Prescriptive Guidance – Prevent public S3 access (ACCT.08): https://docs.aws.amazon.com/prescriptive-guidance/latest/aws-startup-security-baseline/acct-08.html
- Amazon S3 – Block Public Access: https://docs.aws.amazon.com/AmazonS3/latest/userguide/access-control-block-public-access.html
- AWS News – S3 Block Public Access: https://aws.amazon.com/about-aws/whats-new/2018/11/introducing-amazon-s3-block-public-access/
- NGINX – ngx_http_log_module: https://nginx.org/en/docs/http/ngx_http_log_module.html
- NGINX – Admin Guide: Logging: https://docs.nginx.com/nginx/admin-guide/monitoring/logging/
- Apache HTTP Server – Log Files: https://httpd.apache.org/docs/current/logs.html
- Microsoft Learn – Configure Logging in IIS: https://learn.microsoft.com/en-us/iis/manage/provisioning-and-managing-iis/configure-logging-in-iis
- Microsoft – W3C Logging: https://learn.microsoft.com/en-us/windows/win32/http/w3c-logging
- Microsoft – IIS Logging Overview: https://learn.microsoft.com/en-us/previous-versions/iis/6.0-sdk/ms525410%28v%3Dvs.90%29
- Apache HTTP Server – core (<Files>/<FilesMatch>): https://httpd.apache.org/docs/2.4/mod/core.html
- Bolt CMS – Example NGINX block for dotfiles: https://docs.boltcms.io/5.2/installation/webserver/nginx
- MDN – HTTP 200 OK: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/200
- MDN – HTTP 403 Forbidden: https://developer.mozilla.org/he/docs/Web/HTTP/Status/403
- MDN – HTTP 404 Not Found: https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Status/404
- GitHub Docs – Dependabot config in .github/dependabot.yml: https://docs.github.com/en/code-security/dependabot/working-with-dependabot/dependabot-options-reference
- GitHub Docs – Issue templates path: https://docs.github.com/articles/about-issue-and-pull-request-templates
- GitHub Docs – FUNDING.yml in .github: https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/displaying-a-sponsor-button-in-your-repository
- GitLab Docs – CI/CD YAML reference: https://docs.gitlab.com/ci/yaml/
- GitLab Docs – Instance template paths (.gitlab/issue_templates): https://docs.gitlab.com/administration/settings/instance_template_repository/
- BFG Repo-Cleaner – Remove Sensitive Data: https://bfg-repo-cleaner-demos.github.io/github-help/RemoveSensitiveData.html
- Sigma – Log sources and rules: https://sigmahq.io/docs/basics/log-sources.html