jpom IEEE parser#182

Open
jpom wants to merge 2 commits into adsabs:main from jpom:jpomieee

jpom commented Dec 12, 2025

IEEE parser

seasidesparrow self-requested a review December 12, 2025 16:21
jpom changed the title from "Jpomieee" to "jpom IEEE parser" Dec 12, 2025
# Conference location
if self.publicationinfo.find("conflocation") is not None:
confloc = self.publicationinfo.find("conflocation").text
self.base_metadata["conf_location"] = confloc

This line will fail if self.publicationinfo.find("conflocation") is None, because the variable confloc will then be undefined at L96.
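
One way to avoid the undefined variable is to look the element up once and assign only when it actually carries text. The following is a minimal sketch, not the PR's code: it assumes an ElementTree-style element standing in for self.publicationinfo, and the helper name conf_location_from and the sample XML are hypothetical.

```python
import xml.etree.ElementTree as ET


def conf_location_from(publicationinfo):
    """Return the conference location text, or None if it is absent or blank."""
    node = publicationinfo.find("conflocation")
    if node is None or node.text is None:
        return None
    text = node.text.strip()
    # An all-whitespace element is treated the same as a missing one.
    return text or None


# Hypothetical sample records, for illustration only.
with_loc = ET.fromstring(
    "<publicationinfo><conflocation> Sample City, Country </conflocation></publicationinfo>"
)
without_loc = ET.fromstring("<publicationinfo></publicationinfo>")

base_metadata = {}
confloc = conf_location_from(with_loc)
if confloc:
    # Only set the key when there is something to store.
    base_metadata["conf_location"] = confloc
```

Because confloc is always bound (to a string or to None), later code can test it safely instead of relying on the branch having run.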

seasidesparrow left a comment

Good first round. You need to make sure the project passes existing IEEE unit tests, and then add unit tests specifically for conference proceedings.

# self.base_metadata["collection"] = colls_uniq

# TO DO: append confDates & confLocation to %J
if confdate:

You need a way to skip confloc in the output when it is None or an empty string.
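
A sketch of one way to do this, under assumptions: the function name append_conf_details, the comma separator, and the sample strings are all hypothetical, since the PR does not show how the %J string is assembled.

```python
def append_conf_details(pub_string, confdate=None, confloc=None):
    """Append conference date and location to a publication string.

    Values that are None, empty, or whitespace-only are left out.
    """
    parts = [pub_string]
    for value in (confdate, confloc):
        if value and value.strip():
            parts.append(value.strip())
    # The ", " separator is an assumption, not taken from the PR.
    return ", ".join(parts)


# Hypothetical inputs, for illustration only.
full = append_conf_details("Example Conf. Proc.", "1-3 Jan. 2025", "Sample City")
no_loc = append_conf_details("Example Conf. Proc.", "1-3 Jan. 2025", "")
bare = append_conf_details("Example Conf. Proc.", None, None)
```

Filtering in one loop keeps the date and location handling identical, so neither can leave a dangling separator behind.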

# Article sequence number
if articleinfo.find("articleseqnum"):
articleid = articleinfo.find("articleseqnum").get_text()
self.base_metadata["electronic_id"] = articleid

Ideally, if there are firstPage, lastPage and articleseqnum, I'd like to field all three. Right now, the existing test cases have a firstPage and lastPage under pagination, so I'd like to add electronicID rather than using it exclusively.
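
Roughly what I have in mind, as a sketch only: it assumes an ElementTree-style articleinfo element with lowercase tag names, and the function name page_fields and the output keys page_first and page_last are hypothetical (electronic_id is the key already used in this PR).

```python
import xml.etree.ElementTree as ET


def page_fields(articleinfo):
    """Collect first page, last page, and electronic ID, keeping each one that is present."""
    # XML path -> output key. The two page keys are assumed names.
    wanted = {
        "pagination/firstpage": "page_first",
        "pagination/lastpage": "page_last",
        "articleseqnum": "electronic_id",
    }
    fields = {}
    for path, key in wanted.items():
        text = articleinfo.findtext(path)  # None when the element is missing
        if text and text.strip():
            fields[key] = text.strip()
    return fields


# Hypothetical sample record, for illustration only.
sample = ET.fromstring(
    "<articleinfo>"
    "<articleseqnum>7</articleseqnum>"
    "<pagination><firstpage>1</firstpage><lastpage>10</lastpage></pagination>"
    "</articleinfo>"
)
seq_only = ET.fromstring("<articleinfo><articleseqnum>7</articleseqnum></articleinfo>")
```

Each field is looked up independently, so a record with only articleseqnum still yields an electronic ID, and a record with all three keeps all three.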

# Conference volume number
if self.volumeinfo:
self.base_metadata["volume"] = self.volumeinfo.find("volumenum").get_text()
self.base_metadata["issue"] = self.volumeinfo.find("issue").find("issuenum").get_text()

Restore L50, because you're losing issue numbers when available.

else:
self.base_metadata["volume"] = ""

# Conferences don't have an issue number

Ideally, the parser should be able to handle both conferences and journal articles without giving special instructions to the parser.
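
For volume and issue, that could look something like the sketch below, which fills in whatever the record provides instead of branching on record type. It assumes an ElementTree-style volumeinfo element; the function name volume_and_issue and the sample XML are hypothetical, and the empty-string default for volume mirrors the else branch above.

```python
import xml.etree.ElementTree as ET


def volume_and_issue(volumeinfo):
    """Return volume and issue from a record, whether journal or conference."""
    fields = {"volume": ""}  # same default as the PR's else branch
    if volumeinfo is None:
        return fields
    volume = volumeinfo.findtext("volumenum")
    if volume and volume.strip():
        fields["volume"] = volume.strip()
    # The issue is optional: present for journals, usually absent for conferences.
    issue = volumeinfo.findtext("issue/issuenum")
    if issue and issue.strip():
        fields["issue"] = issue.strip()
    return fields


# Hypothetical sample records, for illustration only.
journal = ET.fromstring(
    "<volumeinfo><volumenum>12</volumenum><issue><issuenum>3</issuenum></issue></volumeinfo>"
)
conference = ET.fromstring("<volumeinfo><volumenum>2025</volumenum></volumeinfo>")
```

With this shape the caller can do self.base_metadata.update(...) for any record, and issue numbers survive whenever the source has them.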
