Conversation
…e fields
- add alt_name, descriptors, lineage columns
- add thiols (low/medium/high)
- add agronomic: yield_min/max, maturity, wilt_disease, downy_mildew, powdery_mildew, aphid
- rename aroma_miscellaneous -> aroma_misc
- change substitutes from flat array to {brewhouse, dryhopping} JSON object
- update Hop model fillable, casts and docblock accordingly
- MapHopData: extract alt_name, descriptors, lineage, thiols, all agronomic fields
- MapHopData: store substitutes as {brewhouse, dryhopping} nested structure
- UpsertHop: add validation rules for all new fields (thiols: in:low,medium,high)
- UpsertHop: buildAttributes stores substitutes with brewhouse/dryhopping keys preserved
- ImportHopVarietyJobTest: assert aroma_misc (renamed from aroma_miscellaneous)
- ImportHopVarietyJobTest: assert substitutes as nested brewhouse/dryhopping structure
- HopFactory: add all missing fields (alt_name, descriptors, lineage, thiols, agronomic)
- HopFactory: fix substitutes to use correct nested {brewhouse, dryhopping} format
- HopFactory: use countryCode() instead of country() for realistic 2-letter codes
Storage::fake() creates real files in this directory during tests and does not clean them up automatically after each run.
There was a problem hiding this comment.
Pull request overview
This pull request implements the hop dataset import functionality requested in issue #6. It adds an Artisan command hops:import that loads hop variety data from JSON/JSON5 files stored in the local disk, processes them asynchronously via queued jobs, and stores them in the database with proper validation. The implementation includes comprehensive test coverage for both success and error scenarios.
Changes:
- Added
hops:importArtisan command to scan and dispatch import jobs for JSON/JSON5 files - Implemented queued job to parse, validate, and upsert individual hop varieties
- Created actions for data mapping (JSON structure to internal format) and database upsertion
- Extended database schema and model to support additional hop attributes (lineage, thiols, agronomic data, disease resistance)
- Renamed
Rangevalue object toRangeOrNumberfor clarity - Added JSON5 parser helper to support unquoted keys in data files
Reviewed changes
Copilot reviewed 14 out of 15 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| app/Console/Commands/HopsImportCommand.php | New command to scan storage directory and dispatch import jobs |
| app/Jobs/ImportHopVarietyJob.php | Queued job to process individual hop variety files with error logging |
| app/Actions/MapHopData.php | Maps JSON structure to internal data format |
| app/Actions/UpsertHop.php | Validates and upserts hop data with comprehensive validation rules |
| app/Helpers/Json5Parser.php | Simple JSON5 parser to handle unquoted keys |
| app/Models/Hop.php | Extended model with new fields and proper PHPDoc annotations |
| app/ValueObjects/RangeOrNumber.php | Renamed from Range for better semantic clarity |
| database/migrations/2026_02_18_224936_create_hops_table.php | Added new columns for hop attributes and renamed aroma_miscellaneous to aroma_misc |
| database/factories/HopFactory.php | Updated factory to generate data for new fields |
| tests/Helpers/HopFixture.php | Test fixtures providing sample JSON5 hop data |
| tests/Feature/Jobs/ImportHopVarietyJobTest.php | Comprehensive job tests including error scenarios |
| tests/Feature/Console/Commands/HopsImportCommandTest.php | Command tests verifying job dispatch and file filtering |
| tests/Feature/RangeOrNumberTest.php | Updated tests for renamed class |
| .gitignore | Added pattern to ignore test disk storage |
| storage/framework/testing/.gitignore | Removed (replaced by root gitignore entry) |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| $folder = $this->argument("folder"); | ||
|
|
There was a problem hiding this comment.
The folder argument is passed directly to Storage::files() without validation. While Laravel's Storage facade should handle path traversal attempts, it's a good practice to explicitly validate that the folder path doesn't contain suspicious characters like '..' to prevent potential path traversal attacks. Consider adding validation such as checking for '..' or validating against a whitelist of allowed characters.
| $folder = $this->argument("folder"); | |
| $folder = trim((string) $this->argument("folder")); | |
| // Basic validation to prevent path traversal and disallow suspicious characters | |
| if ( | |
| $folder === "" || | |
| strpos($folder, "..") !== false || | |
| !preg_match('#^[A-Za-z0-9_\-/]+$#', $folder) | |
| ) { | |
| $this->error("Invalid folder argument. Please provide a relative folder name using only letters, numbers, '/', '_', and '-' characters."); | |
| return; | |
| } |
| class Json5Parser | ||
| { | ||
| public function parse(string $content): ?array | ||
| { | ||
| $json = preg_replace('/(?<=[{,\n])\s*([a-zA-Z_]\w*)\s*:/m', '"$1":', $content); | ||
|
|
||
| $data = json_decode($json, true); | ||
|
|
There was a problem hiding this comment.
The regular expression used for JSON5 parsing is too simplistic and may fail on valid JSON5 syntax. It only handles unquoted keys but doesn't support other JSON5 features like comments, trailing commas, or multi-line strings. Additionally, the pattern could incorrectly match keys that appear after array/object values. Consider using a dedicated JSON5 parsing library or document the specific subset of JSON5 that is supported.
| class Json5Parser | |
| { | |
| public function parse(string $content): ?array | |
| { | |
| $json = preg_replace('/(?<=[{,\n])\s*([a-zA-Z_]\w*)\s*:/m', '"$1":', $content); | |
| $data = json_decode($json, true); | |
| use ColinODell\Json5\Json5Decoder; | |
| class Json5Parser | |
| { | |
| public function parse(string $content): ?array | |
| { | |
| try { | |
| $data = Json5Decoder::decode($content, true); | |
| } catch (\Throwable $e) { | |
| return null; | |
| } |
| $attributes[$field] = ($min !== null && $max !== null) | ||
| ? RangeOrNumber::fromRange((float)$min, (float)$max) | ||
| : null; |
There was a problem hiding this comment.
The validation rules allow min and max values to be independently nullable, but the RangeOrNumber creation logic at lines 131-133 only creates a range when both min and max are non-null. This means if only one value is present, it will result in a null range, which may silently discard valid data. Consider adding a validation rule to ensure both min and max are present together, or handle the case where only one is provided.
| $attributes[$field] = ($min !== null && $max !== null) | |
| ? RangeOrNumber::fromRange((float)$min, (float)$max) | |
| : null; | |
| if ($min === null && $max === null) { | |
| $attributes[$field] = null; | |
| } elseif ($min !== null && $max !== null) { | |
| $attributes[$field] = RangeOrNumber::fromRange((float)$min, (float)$max); | |
| } elseif ($min !== null) { | |
| // Only a minimum provided; treat it as a single-value range. | |
| $attributes[$field] = RangeOrNumber::fromRange((float)$min, (float)$min); | |
| } else { | |
| // Only a maximum provided; treat it as a single-value range. | |
| $attributes[$field] = RangeOrNumber::fromRange((float)$max, (float)$max); | |
| } |
| "bitterness" => $this->faker->optional()->randomElement(Bitterness::values()), | ||
| "aromaticity" => $this->faker->optional()->randomElement(Aromaticity::values()), |
There was a problem hiding this comment.
The 'bitterness' and 'aromaticity' fields have been changed to use 'optional()' which means they might not be present in the factory-generated data. However, this creates inconsistency with other nullable fields in the factory. Consider whether these fields should always have values in test data for consistency, or if this change aligns with the domain logic where some hops may not have these classifications.
| "bitterness" => $this->faker->optional()->randomElement(Bitterness::values()), | |
| "aromaticity" => $this->faker->optional()->randomElement(Aromaticity::values()), | |
| "bitterness" => $this->faker->randomElement(Bitterness::values()), | |
| "aromaticity" => $this->faker->randomElement(Aromaticity::values()), |
| $attributes = [ | ||
| "slug" => Str::slug($validated["name"]) . "-hop", |
There was a problem hiding this comment.
The slug generation uses Str::slug($validated["name"]) . "-hop" which always appends "-hop" to the slug. This could lead to redundant slugs like "cascade-hop-hop" if a hop's name already contains "hop". Consider either checking if the name already ends with "hop" before appending, or removing "hop" from the name before slugification.
| $attributes = [ | |
| "slug" => Str::slug($validated["name"]) . "-hop", | |
| $baseSlug = Str::slug($validated["name"]); | |
| if (!Str::endsWith($baseSlug, "-hop")) { | |
| $baseSlug .= "-hop"; | |
| } | |
| $attributes = [ | |
| "slug" => $baseSlug, |
| "descriptors" => $validated["descriptors"] ?? [], | ||
| "lineage" => $validated["lineage"] ?? [], | ||
| "thiols" => $validated["thiols"] ?? null, | ||
| "aroma_citrusy" => $validated["aroma_citrusy"] ?? null, | ||
| "aroma_fruity" => $validated["aroma_fruity"] ?? null, | ||
| "aroma_floral" => $validated["aroma_floral"] ?? null, | ||
| "aroma_herbal" => $validated["aroma_herbal"] ?? null, | ||
| "aroma_spicy" => $validated["aroma_spicy"] ?? null, | ||
| "aroma_resinous" => $validated["aroma_resinous"] ?? null, | ||
| "aroma_sugarlike" => $validated["aroma_sugarlike"] ?? null, | ||
| "aroma_misc" => $validated["aroma_misc"] ?? null, | ||
| "aroma_descriptors" => $validated["aroma_descriptors"] ?? [], | ||
| "substitutes" => [ | ||
| "brewhouse" => $validated["substitutes"]["brewhouse"] ?? [], | ||
| "dryhopping" => $validated["substitutes"]["dryhopping"] ?? [], | ||
| ], |
There was a problem hiding this comment.
The buildAttributes method always sets empty arrays for descriptors, lineage, aroma_descriptors, and substitutes even when they are nullable fields. This prevents distinguishing between "no data provided" (null) and "empty list provided" ([]). Consider using null when the validated data doesn't contain these keys to maintain semantic correctness.
| ImportHopVarietyJob::dispatch($file); | ||
| } | ||
|
|
||
| $this->info("All import jobs have been dispatched to queue, run `php artisan queue:work` to process them."); |
There was a problem hiding this comment.
The command message at line 40 assumes jobs are being queued asynchronously and tells users to run 'php artisan queue:work'. However, if the queue connection is set to 'sync', the jobs will already be processed and this message could be confusing. Consider checking the queue connection and adjusting the message accordingly, or clarifying that the message only applies when using async queues.
| $files = collect(Storage::disk("local")->files($folder)) | ||
| ->filter(fn(string $file): bool => in_array( | ||
| pathinfo($file, PATHINFO_EXTENSION), | ||
| ["json", "json5"], | ||
| true, | ||
| )); |
There was a problem hiding this comment.
The command doesn't handle the case where the specified folder doesn't exist. Storage::files() may throw an exception or return an unexpected result if the folder is missing. Consider adding a check for folder existence and providing a helpful error message to the user if the folder doesn't exist.
| { | ||
| public function parse(string $content): ?array | ||
| { |
There was a problem hiding this comment.
The JSON5 parser uses preg_replace with user-controlled content which could potentially be exploited if very large files are processed, leading to ReDoS (Regular Expression Denial of Service). Consider adding a file size limit check before parsing, or using a more efficient parsing approach.
| { | |
| public function parse(string $content): ?array | |
| { | |
| { | |
| private const MAX_CONTENT_LENGTH = 1048576; // 1 MB limit to mitigate potential ReDoS | |
| public function parse(string $content): ?array | |
| { | |
| if (strlen($content) > self::MAX_CONTENT_LENGTH) { | |
| return null; | |
| } |
| class Json5Parser | ||
| { |
There was a problem hiding this comment.
Is it came from requiremnts? Or its some kind of warranty before fancy JSON data?
|
|
||
| public function handle(Json5Parser $parser, MapHopData $mapper, UpsertHop $upsert): void | ||
| { | ||
| $content = Storage::disk("local")->get($this->filePath); |
There was a problem hiding this comment.
Right now u are fixed on local storage but maybe we can use s3 or smth like that. Let's make that more flexible. The same with uploading and savng files
use Illuminate\Filesystem\FilesystemManager;
$filePath = new FilesystemManager()->readStream($path);
| $table->integer("yield_min")->nullable(); | ||
| $table->integer("yield_max")->nullable(); | ||
| $table->string("maturity")->nullable(); | ||
| $table->string("wilt_disease")->nullable(); | ||
| $table->string("downy_mildew")->nullable(); | ||
| $table->string("powdery_mildew")->nullable(); | ||
| $table->string("aphid")->nullable(); |
There was a problem hiding this comment.
We are on early stage of the project and its does not matter but you should create new one migration file to add new fileds to the table next time.
| Bus::assertDispatched(ImportHopVarietyJob::class, fn($job) => $job->filePath === "hops_data/citra.json5"); | ||
|
|
||
| Bus::assertDispatched(ImportHopVarietyJob::class, fn($job) => $job->filePath === "hops_data/mosaic.json5"); | ||
|
|
||
| Bus::assertNotDispatched(ImportHopVarietyJob::class, fn($job) => $job->filePath === "hops_data/not-a-hop.txt"); |
There was a problem hiding this comment.
| Bus::assertDispatched(ImportHopVarietyJob::class, fn($job) => $job->filePath === "hops_data/citra.json5"); | |
| Bus::assertDispatched(ImportHopVarietyJob::class, fn($job) => $job->filePath === "hops_data/mosaic.json5"); | |
| Bus::assertNotDispatched(ImportHopVarietyJob::class, fn($job) => $job->filePath === "hops_data/not-a-hop.txt"); | |
| Bus::assertDispatched(ImportHopVarietyJob::class, fn($job) => $job->filePath === "hops_data/citra.json5"); | |
| Bus::assertDispatched(ImportHopVarietyJob::class, fn($job) => $job->filePath === "hops_data/mosaic.json5"); | |
| Bus::assertNotDispatched(ImportHopVarietyJob::class, fn($job) => $job->filePath === "hops_data/not-a-hop.txt"); |
| Bus::assertDispatched(ImportHopVarietyJob::class, fn($job) => $job->filePath === "hops_data/citra.json5"); | ||
|
|
||
| Bus::assertDispatched(ImportHopVarietyJob::class, fn($job) => $job->filePath === "hops_data/mosaic.json5"); | ||
|
|
||
| Bus::assertNotDispatched(ImportHopVarietyJob::class, fn($job) => $job->filePath === "hops_data/not-a-hop.txt"); |
| $this->artisan("hops:import", ["folder" => "custom_folder"]) | ||
| ->assertExitCode(0); | ||
|
|
||
| Bus::assertDispatched(ImportHopVarietyJob::class, fn($job) => $job->filePath === "custom_folder/galaxy.json5"); |
| Storage::fake("local"); | ||
| Log::shouldReceive("warning") | ||
| ->once() | ||
| ->withArgs(fn(string $msg) => Str::contains($msg, "File not found")); |
|
|
||
| Log::shouldReceive("warning") | ||
| ->once() | ||
| ->withArgs(fn(string $msg) => Str::contains($msg, "Failed to parse")); |
|
|
||
| Log::shouldReceive("warning") | ||
| ->once() | ||
| ->withArgs(fn(string $msg) => Str::contains($msg, "Validation failed")); |
| "alpha_acid_min" => $data["ingredients"]["alphas"]["min"] ?? null, | ||
| "alpha_acid_max" => $data["ingredients"]["alphas"]["max"] ?? null, | ||
| "beta_acid_min" => $data["ingredients"]["betas"]["min"] ?? null, | ||
| "beta_acid_max" => $data["ingredients"]["betas"]["max"] ?? null, | ||
| "cohumulone_min" => $data["ingredients"]["cohumulones"]["min"] ?? null, | ||
| "cohumulone_max" => $data["ingredients"]["cohumulones"]["max"] ?? null, | ||
| "total_oil_min" => $data["ingredients"]["oils"]["min"] ?? null, | ||
| "total_oil_max" => $data["ingredients"]["oils"]["max"] ?? null, | ||
| "polyphenol_min" => $data["ingredients"]["polyphenols"]["min"] ?? null, | ||
| "polyphenol_max" => $data["ingredients"]["polyphenols"]["max"] ?? null, | ||
| "xanthohumol_min" => $data["ingredients"]["xanthohumols"]["min"] ?? null, | ||
| "xanthohumol_max" => $data["ingredients"]["xanthohumols"]["max"] ?? null, | ||
| "farnesene_min" => $data["ingredients"]["farnesenes"]["min"] ?? null, | ||
| "farnesene_max" => $data["ingredients"]["farnesenes"]["max"] ?? null, | ||
| "linalool_min" => $data["ingredients"]["linalool"]["min"] ?? null, | ||
| "linalool_max" => $data["ingredients"]["linalool"]["max"] ?? null, |
There was a problem hiding this comment.
Maybe use your VO RangeOrNumber?
This should close #6