Skip to content

Comments

Fix 13 : Create SimpleImputer#14

Open
Dhruv127 wants to merge 15 commits intoObliviousAI:masterfrom
Dhruv127:master
Open

Fix 13 : Create SimpleImputer#14
Dhruv127 wants to merge 15 commits intoObliviousAI:masterfrom
Dhruv127:master

Conversation

@Dhruv127
Copy link
Contributor

No description provided.

@Dhruv127 Dhruv127 changed the title Create SimpleImputer Fix 13 : Create SimpleImputer Dec 11, 2023
Copy link
Contributor

@jkfitzsimons jkfitzsimons left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a method that is a drop in replacement of SimpleImputer in sklearn with similar interface. It looks like you accidentally submitted rough notes?

@Dhruv127
Copy link
Contributor Author

sorry for this issue , please check now .

Copy link
Contributor

@jkfitzsimons jkfitzsimons left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are using the numpy methods here and should be using the DP versions from this package - should be a small change


def _impute_mean(self, col, missing_col):
non_missing_values = col[~missing_col]
col_mean = np.nanmean(non_missing_values)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are using nanmean from numpy... shouldn't you use the dp version from this library ;)


def _impute_median(self, col, missing_col):
non_missing_values = col[~missing_col]
col_median = np.nanmedian(non_missing_values)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are using nanmedian from numpy... shouldn't you use the dp version from this library ;)

def _impute_median(self, col, missing_col):
non_missing_values = col[~missing_col]
col_median = np.nanmedian(non_missing_values)
sensitivity = np.nanmax(np.abs(non_missing_values - col_median))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are using nanmax from numpy... shouldn't you use the dp version from this library ;)

elif self.strategy == 'constant':
self.statistics_ = [self.fill_value] * X.shape[1]
else:
self.statistics_ = [np.nanmean(col) if np.issubdtype(col.dtype, np.number) else np.nan for col in X.T]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are using nanmean from numpy... shouldn't you use the dp version from this library ;)

@Dhruv127
Copy link
Contributor Author

Sorry bhaiya , i think there is no nanmax and nanmedium functions in this lib . Can you please check it once .
However i can try to make them , in another issue ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants