fix(loader): avoid NPE in FileLineFetcher close method & update HDFS test by bajisn-666 · Pull Request #710 · apache/hugegraph-toolchain

bajisn-666 · 2026-01-24T10:49:24Z

What is the purpose of the change
This PR fixes a NullPointerException in FileLineFetcher.java. The NPE occurred when the close() method was called while the reader was still null, which usually happens in multi-threaded environments or when file initialization fails.

Also updated HDFSLoadTest.java to use 4 parser threads (--parser-threads 4) to verify the fix in a concurrent scenario.
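
To make the failure mode concrete, here is a minimal sketch of a close method that tolerates a reader that was never opened. ClosableLineSource is a hypothetical stand-in written for illustration; it is not the project's FileLineFetcher, and its field and method names are assumptions.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Hypothetical stand-in for a line fetcher; NOT the project's FileLineFetcher.
class ClosableLineSource {

    // May stay null if opening the underlying file failed.
    private BufferedReader reader;

    ClosableLineSource(BufferedReader reader) {
        this.reader = reader;
    }

    // Safe to call when the reader was never opened, and safe to call twice.
    void close() throws IOException {
        BufferedReader r = this.reader;
        this.reader = null;
        if (r != null) {
            r.close();
        }
    }

    boolean isOpen() {
        return this.reader != null;
    }
}

class ClosableLineSourceDemo {
    public static void main(String[] args) throws IOException {
        ClosableLineSource neverOpened = new ClosableLineSource(null);
        neverOpened.close(); // no NullPointerException

        ClosableLineSource opened = new ClosableLineSource(
            new BufferedReader(new StringReader("a\nb")));
        opened.close();
        opened.close(); // second call is a no-op
        System.out.println(opened.isOpen()); // prints false
    }
}
```

Clearing the field before closing keeps a repeated close() from touching an already closed reader. This sketch does not by itself make concurrent use safe.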

Related issues

Fixes #706

Copilot

Pull request overview

This PR addresses a NullPointerException issue in FileLineFetcher.java that occurred in multi-threaded environments when parser threads > 1. The fix adds null checks in the fetch(), skipOffset(), and checkMatchHeader() methods to handle cases where the reader might be null during concurrent execution.

Changes:

Added null safety checks in FileLineFetcher.java to prevent NPE in multi-threaded scenarios
Updated HDFS tests to use 4 parser threads to verify the fix works under concurrent load
Refactored HDFSUtil.java test utility to simplify HDFS configuration handling
Added .editorconfig rules for Parquet binary files to prevent editor corruption
Added a test Parquet file

Reviewed changes

Copilot reviewed 4 out of 5 changed files in this pull request and generated 6 comments.

Show a summary per file

File	Description
hugegraph-loader/src/main/java/org/apache/hugegraph/loader/reader/file/FileLineFetcher.java	Added null checks in fetch(), skipOffset(), and checkMatchHeader() methods to handle null reader scenarios
hugegraph-loader/src/test/java/org/apache/hugegraph/loader/test/functional/HDFSLoadTest.java	Increased parser thread count to 4 to test concurrent scenarios
hugegraph-loader/src/test/java/org/apache/hugegraph/loader/test/functional/HDFSUtil.java	Simplified HDFS configuration and refactored utility methods
.editorconfig	Added configuration for Parquet files to prevent binary file corruption
hugegraph-loader/src/test/resources/parquet_compress_file/vertex_person.parquet	Added test Parquet file for testing

Copilot · 2026-01-24T13:50:28Z

hugegraph-loader/src/main/java/org/apache/hugegraph/loader/reader/file/FileLineFetcher.java

+                // Fix NPE: check reader again inside loop
+                if (this.reader == null) {
+                    throw new LoadException("Reader is null when skipping offset of file %s",
+                                            readable);
+                }


Contradictory null checking logic: Line 201 enforces that reader must not be null with E.checkState(this.reader != null, ...), but then lines 206-209 add a null check inside the loop. If the reader can become null during iteration (e.g., in a multi-threaded environment), the initial check at line 201 is insufficient. If it cannot, then the null check inside the loop is unnecessary. Consider either: 1) removing the check at line 201 and relying on the null check in the loop, or 2) using synchronized access if reader can be modified by other threads.

Suggested change

-                // Fix NPE: check reader again inside loop
-                if (this.reader == null) {
-                    throw new LoadException("Reader is null when skipping offset of file %s",
-                                            readable);
-                }
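
As a rough illustration of option 2 above (synchronized access), the sketch below performs the null check and the reads under the same lock, so a concurrent close cannot null the field between them. GuardedLineSource and its members are hypothetical names chosen for this example; whether FileLineFetcher's reader is actually shared across threads is not established in this thread.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Hypothetical holder, only loosely modelled on the discussion above.
class GuardedLineSource {

    private final Object lock = new Object();
    private BufferedReader reader; // guarded by lock

    GuardedLineSource(BufferedReader reader) {
        this.reader = reader;
    }

    // Skips up to `offset` lines and returns how many were skipped.
    // The null check and the reads share one lock, so close() cannot
    // clear the field between the check and a read.
    int skip(int offset) throws IOException {
        synchronized (this.lock) {
            if (this.reader == null) {
                throw new IllegalStateException("reader is closed or was never opened");
            }
            int skipped = 0;
            while (skipped < offset && this.reader.readLine() != null) {
                skipped++;
            }
            return skipped;
        }
    }

    void close() throws IOException {
        synchronized (this.lock) {
            if (this.reader != null) {
                this.reader.close();
                this.reader = null;
            }
        }
    }
}

class GuardedLineSourceDemo {
    public static void main(String[] args) throws IOException {
        GuardedLineSource src = new GuardedLineSource(
            new BufferedReader(new StringReader("a\nb\nc")));
        System.out.println(src.skip(2)); // prints 2
        src.close();
        try {
            src.skip(1);
        } catch (IllegalStateException e) {
            System.out.println("closed: " + e.getMessage());
        }
    }
}
```

With a single guarded check at the top, a second check inside the loop is no longer needed. If each fetcher is confined to one thread, the lock is unnecessary and the single up-front check would suffice.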

Copilot · 2026-01-24T13:50:29Z

hugegraph-loader/src/main/java/org/apache/hugegraph/loader/reader/file/FileLineFetcher.java

+            // Fix NPE: check if reader is null before reading
+            if (this.reader == null) {
+                return null;
+            }


PR title mentions "avoid NPE in FileLineFetcher close method" but no changes were made to the closeReader() method (lines 151-155), which already has a null check for this.reader. The actual changes are in fetch(), skipOffset(), and checkMatchHeader() methods. Consider updating the PR title to accurately reflect what was changed, e.g., "fix: add null checks in FileLineFetcher fetch/skipOffset/checkMatchHeader methods".

Copilot · 2026-01-24T13:50:29Z

hugegraph-loader/src/test/java/org/apache/hugegraph/loader/test/functional/HDFSLoadTest.java

                "-g", GRAPH,
                "-h", SERVER,
                "--batch-insert-threads", "2",
+                "--parser-threads", "4", // <--- 【修改点 1】增加并发解析线程数


Comment contains Chinese text in a codebase that appears to use English for comments. For consistency, consider translating to English, e.g., "Modification point 1: Increase concurrent parser thread count".

Copilot · 2026-01-24T13:50:29Z

hugegraph-loader/src/test/java/org/apache/hugegraph/loader/test/functional/HDFSLoadTest.java

                "-g", GRAPH,
                "-h", SERVER,
                "--batch-insert-threads", "2",
+                "--parser-threads", "4", // <--- 【修改点 2】增加并发解析线程数


Comment contains Chinese text in a codebase that appears to use English for comments. For consistency, consider translating to English, e.g., "Modification point 2: Increase concurrent parser thread count".

Copilot · 2026-01-24T13:50:29Z

hugegraph-loader/src/test/java/org/apache/hugegraph/loader/test/functional/HDFSUtil.java

        try {
-            FileUtil.copy(new File(srcPath), this.hdfs, new Path(destPath),
-                          false, this.conf);
+            // 通常测试场景是将本地文件上传到 HDFS


Comment contains Chinese text in a codebase that appears to use English for comments. For consistency, consider translating to English, e.g., "Usually in test scenarios, upload local files to HDFS".

Suggested change

-            // 通常测试场景是将本地文件上传到 HDFS
+            // Usually in test scenarios, upload local files to HDFS

Copilot · 2026-01-24T13:50:29Z

.editorconfig

+indent_size = 4

+# ---------------------------------------------------------
+# 关键修改：防止编辑器破坏 Parquet 二进制文件


Comment contains Chinese text. For consistency with the rest of the file and international collaboration, consider using English, e.g., "Key change: Prevent editor from corrupting Parquet binary files".

Suggested change

-# 关键修改：防止编辑器破坏 Parquet 二进制文件
+# Key change: Prevent editor from corrupting Parquet binary files
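
For context, a minimal sketch of what such a rule might look like, assuming the goal is to stop EditorConfig-aware editors from applying text normalization to .parquet files. The glob and the property choices below are assumptions for illustration, not the rule actually added in this PR.

```ini
# Hypothetical section: opt Parquet files out of text-oriented settings.
[*.parquet]
charset = unset
end_of_line = unset
insert_final_newline = unset
trim_trailing_whitespace = unset
```

Setting a property to unset removes any value inherited from an earlier section, so the editor falls back to its own defaults for these files.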

imbajin · 2026-01-24T15:05:39Z

hugegraph-loader/src/main/java/org/apache/hugegraph/loader/reader/file/FileLineFetcher.java

-import org.apache.hugegraph.loader.reader.line.Line;
-import org.apache.hugegraph.loader.reader.line.LineFetcher;
-import org.apache.hugegraph.loader.source.file.Compression;
-import org.apache.hugegraph.loader.source.file.FileFormat;
-import org.apache.hugegraph.loader.source.file.FileSource;
-import org.slf4j.Logger;
-
 import org.apache.hugegraph.loader.parser.CsvLineParser;
 import org.apache.hugegraph.loader.parser.JsonLineParser;
 import org.apache.hugegraph.loader.parser.LineParser;
 import org.apache.hugegraph.loader.parser.TextLineParser;
 import org.apache.hugegraph.loader.reader.Readable;
+import org.apache.hugegraph.loader.reader.line.Line;
+import org.apache.hugegraph.loader.reader.line.LineFetcher;
+import org.apache.hugegraph.loader.source.file.Compression;
+import org.apache.hugegraph.loader.source.file.FileFormat;
+import org.apache.hugegraph.loader.source.file.FileSource;
 import org.apache.hugegraph.util.E;
 import org.apache.hugegraph.util.Log;
+import org.slf4j.Logger;


Use the project's style file to avoid formatting-only diffs such as this import reordering:

https://github.com/apache/incubator-hugegraph-toolchain/blob/master/.editorconfig (use it to override the IDE's default formatting)

imbajin · 2026-01-24T15:06:17Z

hugegraph-loader/src/main/java/org/apache/hugegraph/loader/reader/file/FileLineFetcher.java

+                    throw new LoadException("Reader is null when skipping offset of file %s",
+                                            readable);


Does this line hit the 100-character limit?

imbajin · 2026-01-24T15:07:34Z

hugegraph-loader/src/main/java/org/apache/hugegraph/loader/reader/file/FileLineFetcher.java

The null-check logic in this method seems imprecise and duplicated.

There may be a better solution for it.
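
One possible direction, sketched under assumptions: route every access through a single private helper, so the decision about a missing reader lives in one place and behaves the same in each method. SingleCheckLineSource, requireReader, and matchesHeader are hypothetical names; the real methods in FileLineFetcher may have different signatures and semantics.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Hypothetical class for illustration; NOT the project's FileLineFetcher.
class SingleCheckLineSource {

    private BufferedReader reader;

    SingleCheckLineSource(BufferedReader reader) {
        this.reader = reader;
    }

    // The only place that decides what "no reader" means.
    private BufferedReader requireReader() {
        BufferedReader r = this.reader;
        if (r == null) {
            throw new IllegalStateException("reader is not open");
        }
        return r;
    }

    // Returns the next line, or null at end of input.
    String fetch() throws IOException {
        return requireReader().readLine();
    }

    // Skips up to `offset` lines and returns how many were skipped.
    int skipOffset(int offset) throws IOException {
        BufferedReader r = requireReader();
        int skipped = 0;
        while (skipped < offset && r.readLine() != null) {
            skipped++;
        }
        return skipped;
    }

    // Consumes one line and compares it with the expected header.
    boolean matchesHeader(String expected) throws IOException {
        return expected.equals(requireReader().readLine());
    }
}

class SingleCheckDemo {
    public static void main(String[] args) throws IOException {
        SingleCheckLineSource src = new SingleCheckLineSource(
            new BufferedReader(new StringReader("name,age\nskipped\nmarko,29")));
        System.out.println(src.matchesHeader("name,age")); // prints true
        System.out.println(src.skipOffset(1));             // prints 1
        System.out.println(src.fetch());                   // prints marko,29
    }
}
```

Compared with returning null in one method and throwing in another, a single helper gives callers one consistent failure signal. Whether a missing reader should be an error or a normal end-of-input condition is a design choice left open here.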

codecov · 2026-01-24T15:18:52Z

Codecov Report

❌ Patch coverage is 25.00000% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 51.49%. Comparing base (b066b80) to head (c6a2b0d).
⚠️ Report is 66 commits behind head on master.

Files with missing lines	Patch %	Lines
.../hugegraph/loader/reader/file/FileLineFetcher.java	25.00%	4 Missing and 2 partials ⚠️

❗ There is a different number of reports uploaded between BASE (b066b80) and HEAD (c6a2b0d).

HEAD has 1 upload less than BASE

Flag BASE (b066b80) HEAD (c6a2b0d)

2 1

Additional details and impacted files

@@              Coverage Diff              @@
##             master     #710       +/-   ##
=============================================
- Coverage     62.49%   51.49%   -11.01%     
+ Complexity     1903      973      -930     
=============================================
  Files           262      111      -151     
  Lines          9541     5861     -3680     
  Branches        886      755      -131     
=============================================
- Hits           5963     3018     -2945     
+ Misses         3190     2580      -610     
+ Partials        388      263      -125

bajisn-666 added 2 commits January 24, 2026 16:52

fix(loader): fix issue 706 (apache#706)

703ba9f

fix: solve NPE in FileLineFetcher and update HDFSLoadTest with multi-threads

0aedbce

dosubot bot added the size:M This PR changes 30-99 lines, ignoring generated files. label Jan 24, 2026

github-actions bot added the loader hugegraph-loader label Jan 24, 2026

dosubot bot added the bug Something isn't working label Jan 24, 2026

bajisn-666 added 5 commits January 24, 2026 19:00

trigger actions again

416f52a

Fix CI errors: update license and fix compilation

3811ede

Fix: HDFSLoadTest formatting or logic

0b0e2cc

fix: correct .editorconfig format and add parquet exclusion

54e6938

fix: disable hdfs cache to avoid filesystem closed exception apache#706

4776e7e

dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:M This PR changes 30-99 lines, ignoring generated files. labels Jan 24, 2026

update: fix code style and warnings

c6a2b0d

imbajin requested a review from Copilot January 24, 2026 13:41

Copilot started reviewing on behalf of imbajin January 24, 2026 13:41

Copilot AI reviewed Jan 24, 2026

imbajin reviewed Jan 24, 2026

bajisn-666 wants to merge 8 commits into apache:master from bajisn-666:fix-issue-706
