Description
Data Quality Issue Details
Issue Type:
- [x] Incorrect financial values (wrong numbers)
- Missing financial data (expected data not present)
- Calculation errors (formulas producing wrong results)
- Data inconsistency (different values for same metric)
- Historical data problems (changes over time)
Environment
EdgarTools Version: 5.17.1
Python Version: 3.10.19
Operating System: Windows 11
Financial Data Details
Company/Ticker: AZN
Form Type: 20-F (e.g., 10-K, 10-Q, 8-K)
Filing Date/Period: 2024
Statement Type: Balance Sheet, Income Statement, Cash Flow Statement
Specific Metric/Concept:
- Financial line item: Several
- XBRL concept name: Several
Data Issue
Expected Value:
Provided Excel files with info for all three
Actual Value from EdgarTools:
Code to reproduce:
import sys

from edgar import Company
from edgar.xbrl import XBRLS

company_ticker = input("Enter company ticker (e.g., MSFT): ").upper()
write_files = input("Do you want to save CSV files of the sheets? Y/N ").upper()
number_years = int(input("Enter the number(years) of annual reports desired = "))

company = Company(company_ticker)

# Prefer 10-K filings (US GAAP); fall back to 20-F (IFRS) when there are none.
ef_10k = company.get_filings(form=["10-K", "10-K/A"])
if len(ef_10k) == 0:
    print("No 10-K filings found; switching to 20-F (IFRS).")
    Taxonomy = 'IFRS'
    ef_20f = company.get_filings(form="20-F", amendments=False)
    if len(ef_20f) == 0:
        # The ticker lookup may have failed; retry with an explicit CIK.
        CIK = input("Either the ticker is wrong or it is OTC. Enter the CIK: ")
        company = Company(CIK)
        ef_20f = company.get_filings(form="20-F", amendments=False)
    if len(ef_20f) == 0:
        print("No SEC filings found.")
        sys.exit()
    filings = ef_20f.latest(number_years)
else:
    Taxonomy = 'GAAP'
    filings = ef_10k.latest(number_years)

print("Selected filings passed to XBRLS:")
print(filings)

# Stitch the selected filings together and pull the three statements.
xbrls = XBRLS.from_filings(filings)
income_statement = xbrls.statements.income_statement(view="detailed")
balance_sheet = xbrls.statements.balance_sheet(view="detailed")
cash_flow = xbrls.statements.cashflow_statement(view="detailed")

bs_df = balance_sheet.to_dataframe()
is_df = income_statement.to_dataframe()
cf_df = cash_flow.to_dataframe()
Cross-Verification
Have you verified this issue with:
- Multiple time periods for same company
- Multiple companies with same issue
- [x] Direct SEC filing comparison
- Other financial data sources
Affects multiple periods/companies?
- Companies tested: AZN
- Time periods tested: 2024
- Pattern observed: (e.g., all Q4 periods affected, only certain companies)
Expected Behavior
What should happen:
I know you only recently added IFRS support, so this is not surprising. I reviewed the EdgarTools output for AZN for 2024 against the annual report and color-coded the results in the attached Excel files (unfortunately, I chose a poor color scheme):
yellow = match
red/salmon = error
green = not sure; in most cases the number is correct, but the sign is the opposite of what is in the annual report
See the attached Excel files:
AZN_Balance_Sheets.xlsx
AZN_Income_Statement.xlsx
AZN_Cash_Flow_Statement.xlsx
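The color legend above could also be applied programmatically. The following is only a minimal sketch under stated assumptions: `classify` and `compare_statement` are hypothetical helpers (not part of EdgarTools), the statements are assumed to have been reduced to plain `{label: value}` dictionaries for one period, and the figures are made-up placeholders rather than AstraZeneca data.

```python
import math


def classify(expected, actual, rel_tol=1e-6):
    """Label one value pair: 'match', 'sign_flip', 'mismatch', or 'missing'."""
    if expected is None or actual is None:
        return "missing"
    if math.isclose(expected, actual, rel_tol=rel_tol):
        return "match"  # yellow in the spreadsheets
    if math.isclose(expected, -actual, rel_tol=rel_tol):
        return "sign_flip"  # green: same magnitude, opposite sign
    return "mismatch"  # red/salmon


def compare_statement(expected, actual, rel_tol=1e-6):
    """Classify every line item that appears in either dictionary."""
    labels = sorted(set(expected) | set(actual))
    return {
        label: classify(expected.get(label), actual.get(label), rel_tol)
        for label in labels
    }


# Placeholder figures for illustration only (hypothetical, not from the filing).
expected = {"Revenue": 1000.0, "Cost of sales": -400.0,
            "Finance expense": -50.0, "Tax": -30.0}
actual = {"Revenue": 1000.0, "Cost of sales": 400.0,
          "Finance expense": -55.0}

report = compare_statement(expected, actual)
for label, status in report.items():
    print(f"{label}: {status}")
```

Separating sign flips from true mismatches may help triage, since a flipped sign often points to a presentation or sign-convention difference rather than a wrong value being extracted.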
Data validation rules:
- Should the value be positive/negative?
- Expected magnitude/range?
- Should it match specific calculations?
Additional Context
- AstraZeneca Annual Report 2024: https://www.astrazeneca.com/investor-relations/annual-reports/annual-report-2024.html
- pages 148-151
Impact Assessment:
- Minor (affects specific edge case)
- Moderate (affects common use cases)
- Major (affects core financial calculations)
- Critical (produces completely wrong results)
Data quality issues are high priority and will be verified against official SEC filings. Accuracy is fundamental to EdgarTools.