This change introduces context-aware analysis to Oasis, allowing it to consider the programming language and framework of the scanned codebase. Users can specify the language and framework via command-line arguments. Oasis will now automatically detect the technology stack based on file extensions and framework-specific files if the language is not manually specified. This context is then used to tailor the security analysis and provide more relevant results. Several language contexts (PHP, Python, JavaScript, Java, Ruby) with common frameworks have been added.
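The command-line side of this could look roughly like the sketch below. The flag names `--language`, `--framework` and `--input-path`, and the helpers `build_parser` and `parse_stack_args`, are assumptions made for illustration. The description only states that language and framework can be passed as arguments.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Build a parser exposing hypothetical language/framework options."""
    parser = argparse.ArgumentParser(prog="oasis")
    # Flag names are assumed for illustration; the PR text only says that
    # language and framework can be given on the command line.
    parser.add_argument("--input-path", dest="input_path", required=True)
    parser.add_argument("--language", default=None)
    parser.add_argument("--framework", default=None)
    return parser


def parse_stack_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse arguments and normalise the stack names to lower case."""
    args = build_parser().parse_args(argv)
    # Normalise case so "PHP" and "php" select the same context.
    if args.language:
        args.language = args.language.lower()
    if args.framework:
        args.framework = args.framework.lower()
    return args
```

When `language` stays `None`, the caller would fall back to automatic detection.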
Reviewer's Guide by Sourcery
This pull request introduces context-aware analysis to Oasis. It allows users to specify the programming language and framework, or Oasis can automatically detect the technology stack. This context is then used to tailor the security analysis and provide more relevant results. The implementation includes adding command-line arguments, creating a …
Sequence diagram for technology stack detection
sequenceDiagram
participant Oasis
participant TechnologyContextManager
Oasis->>TechnologyContextManager: detect_technology_stack()
alt Language specified in args
TechnologyContextManager->>TechnologyContextManager: _get_manual_technology_context()
TechnologyContextManager->>TechnologyContextManager: _validate_language_context(language)
alt Framework specified
TechnologyContextManager->>TechnologyContextManager: _validate_framework_context(language, framework)
end
else No language specified
TechnologyContextManager->>TechnologyContextManager: _auto_detect_stack()
TechnologyContextManager->>TechnologyContextManager: _detect_primary_language(input_path)
alt Language detected
TechnologyContextManager->>TechnologyContextManager: _detect_framework(input_path, detected_language)
end
end
TechnologyContextManager-->>Oasis: language, framework
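The automatic branch of the diagram above can be approximated as follows. This is a minimal sketch, not the PR's implementation: the lookup tables, the function names and the "most frequent extension wins" rule are assumptions. The real mappings live in Oasis's context files.

```python
from collections import Counter
from pathlib import Path
from typing import Dict, List, Optional, Tuple

# Illustrative tables only (assumed); Oasis loads its own from context files.
LANGUAGE_EXTENSIONS: Dict[str, List[str]] = {
    "php": [".php"],
    "python": [".py"],
    "javascript": [".js", ".jsx"],
}
FRAMEWORK_INDICATORS: Dict[str, Dict[str, List[str]]] = {
    "php": {"laravel": ["artisan"]},
    "python": {"django": ["manage.py"]},
}


def detect_primary_language(root: Path) -> Optional[str]:
    """Return the language whose extensions occur most often under root."""
    ext_to_lang = {
        ext: lang for lang, exts in LANGUAGE_EXTENSIONS.items() for ext in exts
    }
    counts = Counter(
        ext_to_lang[p.suffix.lower()]
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in ext_to_lang
    )
    return counts.most_common(1)[0][0] if counts else None


def detect_framework(root: Path, language: str) -> Optional[str]:
    """Return the first framework with an indicator file in root."""
    for framework, indicators in FRAMEWORK_INDICATORS.get(language, {}).items():
        if any((root / name).exists() for name in indicators):
            return framework
    return None


def detect_stack(root: Path) -> Tuple[Optional[str], Optional[str]]:
    """Detect the language first, then look for a framework of that language."""
    language = detect_primary_language(root)
    framework = detect_framework(root, language) if language else None
    return language, framework
```

Framework detection only runs once a language is known, which mirrors the nested `alt Language detected` block in the diagram.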
Sequence diagram for security analysis with technology context
sequenceDiagram
participant Oasis
participant SecurityAnalyzer
participant TechnologyContextManager
Oasis->>SecurityAnalyzer: SecurityAnalyzer(...)
alt Language detected
Oasis->>SecurityAnalyzer: set_technology_context(language, framework)
SecurityAnalyzer->>TechnologyContextManager: load_context(language, framework)
SecurityAnalyzer->>SecurityAnalyzer: self.current_tech_stack = {language, framework}
end
SecurityAnalyzer->>SecurityAnalyzer: _build_analysis_prompt(vuln_name, vuln_desc, vuln_patterns, code)
SecurityAnalyzer->>TechnologyContextManager: get_security_context()
SecurityAnalyzer-->>Oasis: prompt
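To make the last steps of this diagram concrete, here is a hedged sketch of how a prompt builder might fold the security context into the analysis prompt. The parameter names follow `_build_analysis_prompt` from the diagram. The extra `security_context` parameter, the section labels and the layout are assumptions, not the wording Oasis actually sends to the model.

```python
from typing import List


def build_analysis_prompt(
    vuln_name: str,
    vuln_desc: str,
    vuln_patterns: List[str],
    code: str,
    security_context: str = "",
) -> str:
    """Assemble a prompt, prepending stack-specific guidance when available."""
    sections = []
    if security_context:
        # Only present when a language (and optionally a framework) was resolved.
        sections.append(f"Technology context:\n{security_context}")
    sections.append(f"Vulnerability: {vuln_name}\n{vuln_desc}")
    if vuln_patterns:
        patterns = "\n".join(f"- {p}" for p in vuln_patterns)
        sections.append(f"Patterns to look for:\n{patterns}")
    sections.append(f"Code to analyze:\n{code}")
    return "\n\n".join(sections)
```

With an empty `security_context` the prompt carries no technology section, which corresponds to the case where no language was detected.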
Updated class diagram for TechnologyContextManager
classDiagram
class TechnologyContextManager {
-contexts_dir: Path
-loaded_context: Dict
-tech_stack: Dict
-language_extensions: Dict
-framework_indicators: Dict
+__init__(contexts_dir: str)
+_load_language_configurations()
+load_context(language: str, framework: Optional[str]) : Dict
+detect_technology_stack() : Tuple[str, Optional[str]]
+_get_manual_technology_context() : Tuple[Optional[str], Optional[str]]
+_validate_language_context(language: str) : bool
+_validate_framework_context(language: str, framework: str) : bool
+_auto_detect_stack() : Tuple[Optional[str], Optional[str]]
+_detect_primary_language(input_path: Path) : Optional[str]
+_count_file_extensions(input_path: Path) : Dict[str, int]
+_detect_framework(input_path: Path, language: str) : Optional[str]
+_check_framework_indicators(input_path: Path, indicators: list) : bool
+get_language_extensions(language: Optional[str]) : list
+get_ignore_patterns() : list
+get_security_context() : str
+_load_yaml(path: Path) : Dict
}
Updated class diagram for SecurityAnalyzer
classDiagram
class SecurityAnalyzer {
-args
-llm_model: str
-embedding_manager: EmbeddingManager
-ollama_manager: OllamaManager
-scan_model: str
-tech_context: TechnologyContextManager
-current_tech_stack: Dict
+__init__(args, llm_model: str, embedding_manager: EmbeddingManager, ollama_manager: OllamaManager, scan_model: str)
+set_technology_context(language: str, framework: Optional[str])
+_get_vulnerability_details(vulnerability: Union[str, Dict]) : Tuple[str, str, list, str, str]
+_build_analysis_prompt(vuln_name: str, vuln_desc: str, vuln_patterns: list, code: str) : str
}
Hey @psyray - I've reviewed your changes - here's some feedback:
Overall Comments:
- Consider adding more context files for other languages and frameworks to expand the tool's capabilities.
- The automatic technology detection logic could be made more robust by considering multiple factors.
Here's what I looked at during the review
- 🟡 General issues: 1 issue found
- 🟢 Security: all looks good
- 🟢 Testing: all looks good
- 🟡 Complexity: 1 issue found
- 🟢 Documentation: all looks good
    return self.loaded_context

    def detect_technology_stack(self) -> Tuple[str, Optional[str]]:
issue (bug_risk): Usage of self.args without initialization
Several methods (e.g. detect_technology_stack and _auto_detect_stack) reference self.args (like self.args.language and self.args.input_path) even though self.args is not set anywhere in TechnologyContextManager. To avoid runtime errors, consider either passing args during initialization or providing a mechanism to inject them before these methods are used.
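The failure mode described here can be reproduced in isolation. The two classes below are hypothetical stand-ins, not the PR's code: one reads `self.args` without ever assigning it, the other receives `args` through its constructor.

```python
from types import SimpleNamespace
from typing import Optional


class ManagerWithoutArgs:
    """Mimics the reported problem: self.args is read but never assigned."""

    def detect_language(self) -> Optional[str]:
        return self.args.language  # raises AttributeError at call time


class ManagerWithArgs:
    """Receives args explicitly, so the dependency is visible and testable."""

    def __init__(self, args: SimpleNamespace) -> None:
        self.args = args

    def detect_language(self) -> Optional[str]:
        # getattr tolerates argument objects that lack a language attribute.
        return getattr(self.args, "language", None)
```

The first variant only fails when the method is called, so the bug can go unnoticed until detection actually runs.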
    logger = logging.getLogger(__name__)

    class TechnologyContextManager:
issue (complexity): Consider refactoring the TechnologyContextManager class into smaller, focused components such as a YAMLLoader and a TechnologyDetector to improve clarity and separation of concerns, and inject dependencies explicitly to improve testability.
Consider breaking down the monolithic `TechnologyContextManager` into smaller, focused components to improve clarity and separation of concerns. For example:
1. **Extract YAML loading:**
   Create a dedicated utility class for YAML loading. This isolates file I/O and error handling from context management.

```python
class YAMLLoader:
    @staticmethod
    def load(path: Path) -> Dict:
        try:
            if path.exists():
                with open(path) as f:
                    return yaml.safe_load(f)
            logger.warning(f"File not found: {path}")
        except Exception as e:
            logger.error(f"Error loading {path}: {e}")
        return {}
```

   Then update your manager:

```python
def _load_yaml(self, path: Path) -> Dict:
    return YAMLLoader.load(path)
```

2. **Separate auto-detection logic:**
   Move technology stack detection (both manual and automatic) into a dedicated detector class. This reduces method responsibilities in your context manager.

```python
class TechnologyDetector:
    def __init__(self, args, language_extensions, framework_indicators):
        self.args = args
        self.language_extensions = language_extensions
        self.framework_indicators = framework_indicators

    def get_manual_context(self) -> Tuple[Optional[str], Optional[str]]:
        language = self.args.language.lower()
        framework = getattr(self.args, 'framework', None)
        framework = framework.lower() if framework else None
        return language, framework

    def auto_detect_context(self, input_path: Path) -> Tuple[Optional[str], Optional[str]]:
        # ... perform detection using language_extensions and framework_indicators ...
        return detected_language, detected_framework
```

   Then your manager can delegate detection:

```python
def detect_technology_stack(self) -> Tuple[str, Optional[str]]:
    detector = TechnologyDetector(self.args, self.language_extensions, self.framework_indicators)
    if getattr(self.args, 'language', None):
        return detector.get_manual_context()
    detected = detector.auto_detect_context(Path(self.args.input_path))
    return detected
```

3. **Inject dependencies explicitly:**
   Instead of accessing `self.args` directly throughout, inject an `args` dependency (or its required parameters) via the constructor. This makes dependencies explicit and easier to test.

```python
class TechnologyContextManager:
    def __init__(self, contexts_dir: str = "oasis/contexts", args=None):
        self.contexts_dir = Path(contexts_dir)
        self.args = args
        # ...
```

These focused changes keep functionality intact while reducing complexity and improving testability.
…, Drupal, Nuxt, React, CodeIgniter, Symfony, and Sinatra Contexts
This commit significantly expands Oasis's vulnerability detection capabilities by adding nine new contexts: Java/Spring, PHP (WordPress, Drupal, CodeIgniter, Symfony), JavaScript (Nuxt, React), and Ruby (Sinatra). It also enhances the existing PHP Laravel context with more comprehensive vulnerability patterns. These additions provide more specific and accurate vulnerability detection across a wider range of popular frameworks.
This commit introduces significant improvements to technology context handling and enables more adaptive security analysis. It adds automatic technology stack detection, including support for multiple languages and frameworks, and integrates this context into the analysis process. Vulnerability patterns are now loaded based on the detected stack, and framework-specific patterns are merged with base patterns for more accurate analysis. Additionally, a new analysis context manager is introduced to streamline context setup and access during analysis. The commit also includes updates to the README to document the new features and command-line arguments. Finally, ignore patterns in context files are now quoted to prevent issues with special characters.
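One plausible reading of "framework-specific patterns are merged with base patterns" is sketched below. The data shape (a mapping from vulnerability name to a list of pattern strings), the function name `merge_patterns` and the de-duplication rule are assumptions. The commit message does not specify them.

```python
from typing import Dict, List


def merge_patterns(
    base: Dict[str, List[str]], framework: Dict[str, List[str]]
) -> Dict[str, List[str]]:
    """Combine per-vulnerability pattern lists, keeping order and dropping duplicates."""
    # Copy the base lists so the caller's data is not mutated.
    merged = {name: list(patterns) for name, patterns in base.items()}
    for name, patterns in framework.items():
        target = merged.setdefault(name, [])
        for pattern in patterns:
            if pattern not in target:
                target.append(pattern)
    return merged
```

Base patterns come first in each list, and framework entries only add patterns the language-level context does not already contain.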
This change enhances the code readability of the TechnologyContextManager and updates the dependencies used for scanning. The logic for identifying file extensions and checking framework indicators is simplified. Additionally, the example command in the README is updated to reflect new preferred scanning models.
Summary by Sourcery
Introduce context-aware analysis to tailor security analysis based on the programming language and framework of the scanned codebase. The technology stack can be specified via command-line arguments, or automatically detected based on file extensions and framework-specific files. The context is then used to tailor the security analysis and provide more relevant results.
New Features: