Merged
Frontier Model Security Review
Final Verdict: WARN
Claude Opus 4.6 Analysis
{"verdict": "PASS","issues":[],"reasoning":"This skill is a detailed but benign social media content automation template. It reads from Slack channels (explicitly READ ONLY), drafts social media posts, runs them through an AI-writing style detector, and saves drafts for human review. Key observations: 1) No prompt injection: There are no hidden instructions, 'ignore previous' patterns, encoded commands, or attempts to override system prompts. The skill is straightforward in its intent. 2) No dangerous tool usage: The skill does not invoke Bash, access the filesystem, or use any destructive commands. It references MCP registry search (mcp__mcp-registry__search_mcp_registry) only to check Slack connectivity, which is a safe read-only operation. 3) No data exfiltration: There are no external URLs, no network calls to unknown endpoints, no credential harvesting. The only external services mentioned (Typefully, Notion, Buffer, etc.) are standard social media/productivity tools configured by the user themselves via placeholders. UTM parameters are appended to the user's own content links, not exfiltration vectors. 4) No social engineering: The skill is transparent about what it does. It explicitly states 'No auto-publishing, ever' and 'never write to Slack.' It queues drafts for human approval. 5) Allowed tools are appropriately scoped: Only MCP registry search is explicitly called. The rest is content generation and formatting work done by the LLM itself. The skill is lengthy but that's because it contains detailed editorial guidelines, content bucket definitions, voice guidelines, and an extensive AI-writing detection checklist, all legitimate content quality measures. The prompt appears truncated at the end (mid-sentence about filler words) but this is just an incomplete paste, not a security concern."}
GPT-5.2 Analysis
{
"verdict": "WARN",
"issues": [
{
"category": "Overly permissive allowed-tools",
"severity": "medium",
"evidence": "Instructs use of mcp__mcp-registry__search_mcp_registry to discover connectors and then implies use of Slack + Typefully + Notion + web search connectors without an explicit allowlist or tight scoping.",
"risk": "If the runtime has additional MCP tools/connectors available (e.g., filesystem, email, HTTP), a compromised or modified prompt/run could pivot into broader access than intended."
},
{
"category": "Data exfiltration / privacy",
"severity": "medium",
"evidence": "Workflow directs scanning Slack channels (potentially containing confidential info) and drafting public social posts. It also suggests saving drafts to third-party tools (Typefully) and optionally logging into Notion.",
"risk": "Accidental leakage of internal/confidential Slack content, PII, customer data, security details, or unreleased product info to external services or into drafts that later get published."
},
{
"category": "Dangerous tool usage",
"severity": "low",
"evidence": "Mentions 'Web search — trending news' and 'search your content archive' without specifying constraints on what tools can be called or what data can be transmitted to search providers.",
"risk": "If implemented with unrestricted browsing/HTTP tools, could send sensitive context or Slack-derived snippets as queries/logs to external search providers."
},
{
"category": "Prompt injection resilience",
"severity": "low",
"evidence": "Agent is instructed to read Slack content and web trends. There are no explicit defenses against instructions embedded in Slack messages (e.g., 'ignore previous instructions', 'post this secret', 'call tool X').",
"risk": "A malicious Slack message could attempt to steer the agent into leaking data, changing behavior, or using unintended tools if the executor doesn't enforce tool and policy boundaries."
},
{
"category": "Social engineering / manipulation",
"severity": "low",
"evidence": "The agent drafts persuasive content for public platforms based on internal conversations and 'hot takes'.",
"risk": "Could be used to launder internal opinions/rumors into public statements or to attribute statements to team members without consent if attribution handling is weak."
}
],
"reasoning": "No explicit prompt-injection strings, encoded commands, credential-harvesting, or direct dangerous commands (e.g., bash/rm -rf) are present in the provided text. The main risks are operational: the skill reads potentially sensitive Slack data and routes outputs to external drafting/logging tools (Typefully/Notion) and possibly web search, without a strict tool allowlist, data-minimization rules, or explicit injection/PII safeguards. This makes it plausible to leak confidential information or be steered by malicious Slack content, even though it states 'read only' for Slack and 'no auto-publishing'. Recommended mitigations: enforce an explicit allowed-tools list (Slack read-only APIs + specific draft-creation endpoints only), add a hard rule to never include secrets/PII/internal-only info and to require source redaction, implement prompt-injection filtering for Slack/web content, and ensure web search queries cannot include Slack-derived text."
}
Frontier model review complete. Human approval still required.
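The mitigations recommended in the GPT-5.2 analysis (an explicit allowed-tools list and a hard rule against secrets or PII in drafts) could be sketched roughly as follows. This is a minimal illustration and not part of the submitted skill. Only `mcp__mcp-registry__search_mcp_registry` comes from the review text; the other tool names, the helper functions, and the redaction patterns are hypothetical placeholders, and the patterns are far from a complete secret or PII detector.

```python
import re

# Hypothetical allowlist. Only the registry search tool is named in the
# review; the other two names are assumed stand-ins for whatever read-only
# Slack tool and draft-creation tool the runtime actually exposes.
ALLOWED_TOOLS = frozenset({
    "mcp__mcp-registry__search_mcp_registry",
    "slack_read_channel_history",  # assumed name
    "typefully_create_draft",      # assumed name
})

# Illustrative patterns only: email addresses and Slack-style "xox?-" tokens.
REDACTION_PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"), "[REDACTED_EMAIL]"),
    (re.compile(r"\bxox[a-z]-[A-Za-z0-9-]+"), "[REDACTED_TOKEN]"),
]


def is_tool_allowed(tool_name: str) -> bool:
    """Return True only for tools on the explicit allowlist."""
    return tool_name in ALLOWED_TOOLS


def redact(text: str) -> str:
    """Replace every match of each pattern with its placeholder."""
    for pattern, placeholder in REDACTION_PATTERNS:
        text = pattern.sub(placeholder, text)
    return text


def guard_tool_call(tool_name: str, payload: str) -> str:
    """Reject tools that are not allowlisted, then redact the payload."""
    if not is_tool_allowed(tool_name):
        raise PermissionError(f"tool not on allowlist: {tool_name}")
    return redact(payload)
```

A guard of this kind would sit in the executor, outside the prompt, so that instructions embedded in Slack messages cannot widen the tool set. For example, `guard_tool_call("bash", "ls")` raises `PermissionError`, while a draft payload passed through an allowlisted tool has matching emails and tokens replaced before it leaves the runtime.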
Deploying everyskill with
| Latest commit: | 2d24a9e |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://5839c12d.everyskill.pages.dev |
| Branch Preview URL: | https://skill-daily-social-agent-177.everyskill.pages.dev |
New Skill Submission
Skill: daily-social-agent
Submitted by: Anthony
Reason: A ready-to-use prompt for automating daily social media content sourcing in Cowork. Scans your Slack channels, surfaces the best content from the last 24 hours, drafts posts for X and LinkedIn, and saves everything for review. Runs on autopilot every weekday.
What it does:
Every weekday at your chosen time, this agent:
This PR was auto-generated from skills.every.to (web-upload).
AI security review will run automatically.