
ValueError: a must be greater than 0 unless no samples are taken #110

@shuyu-rich

Description


I'm using a local model and I don't have a dataset; there is no data in my Argilla account either. I could only generate some data myself, based on the keywords the code uses when reading datasets. Now I'm getting an error and I don't know where the problem lies.
Here is my error message:

D:\lla\AutoPrompt-main\AutoPrompt\utils\config.py:6: LangChainDeprecationWarning: Importing HuggingFacePipeline from langchain.llms is deprecated. Please replace
deprecated imports:

from langchain.llms import HuggingFacePipeline

with new imports of:

Starting step 0
Processing samples: 0it [00:00, ?it/s]
Processing samples: 0it [00:00, ?it/s]
Setting pad_token_id to eos_token_id:50256 for open-end generation.
Setting pad_token_id to eos_token_id:50256 for open-end generation.
Previous prompt score:
nan
#########

Get new prompt:
D:/lla/AutoPrompt-main/prompts/predictor_completion/prediction.prompt
Processing samples: 0%| | 0/1 [00:00<?, ?it/s]
Setting pad_token_id to eos_token_id:50256 for open-end generation.
Processing samples: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:05<00:00, 5.39s/it]
Starting step 1
Processing samples: 0%| | 0/2 [00:00<?, ?it/s]
Setting pad_token_id to eos_token_id:50256 for open-end generation.
Setting pad_token_id to eos_token_id:50256 for open-end generation.
Processing samples: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:07<00:00, 4.00s/it]
Processing samples: 0it [00:00, ?it/s]
Setting pad_token_id to eos_token_id:50256 for open-end generation.
Setting pad_token_id to eos_token_id:50256 for open-end generation.
Previous prompt score:
nan
#########

Get new prompt:
D:/lla/AutoPrompt-main/prompts/predictor_completion/prediction.prompt
Processing samples: 0%| | 0/1 [00:00<?, ?it/s]
Setting pad_token_id to eos_token_id:50256 for open-end generation.
Processing samples: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:05<00:00, 5.02s/it]
Starting step 2
Processing samples: 0%| | 0/2 [00:00<?, ?it/s]
Setting pad_token_id to eos_token_id:50256 for open-end generation.
Setting pad_token_id to eos_token_id:50256 for open-end generation.
Processing samples: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:06<00:00, 3.35s/it]
Processing samples: 0it [00:00, ?it/s]
Setting pad_token_id to eos_token_id:50256 for open-end generation.
Setting pad_token_id to eos_token_id:50256 for open-end generation.
Previous prompt score:
nan
#########

Get new prompt:
D:/lla/AutoPrompt-main/prompts/predictor_completion/prediction.prompt
Processing samples: 0%| | 0/1 [00:00<?, ?it/s]
Setting pad_token_id to eos_token_id:50256 for open-end generation.
Processing samples: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:04<00:00, 4.95s/it]
Starting step 3
Processing samples: 0%| | 0/1 [00:00<?, ?it/s]
Setting pad_token_id to eos_token_id:50256 for open-end generation.
Processing samples: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:04<00:00, 4.63s/it]
Processing samples: 0it [00:00, ?it/s]
Setting pad_token_id to eos_token_id:50256 for open-end generation.
Setting pad_token_id to eos_token_id:50256 for open-end generation.
Previous prompt score:
nan
#########

Get new prompt:
D:/lla/AutoPrompt-main/prompts/predictor_completion/prediction.prompt
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ D:\lla\AutoPrompt-main\AutoPrompt\dataset\base_dataset.py:145 in sample_records │
│ │
│ 142 │ │ │ │ df_samples = self.records.head(n) │
│ 143 │ │ else: │
│ 144 │ │ │ try: │
│ ❱ 145 │ │ │ │ df_samples = self.records.sample(n) │
│ 146 │ │ │ except: │
│ 147 │ │ │ │ n = 1  # ensure the sample size is at least 1 │
│ 148 │ │ │ │ df_samples = self.records.sample(n=n) │
│ │
│ C:\Users\PS\AppData\Roaming\Python\Python310\site-packages\pandas\core\generic.py:5773 in sample │
│ │
│ 5770 │ │ if weights is not None: │
│ 5771 │ │ │ weights = sample.preprocess_weights(self, weights, axis) │
│ 5772 │ │ │
│ ❱ 5773 │ │ sampled_indices = sample.sample(obj_len, size, replace, weights, rs) │
│ 5774 │ │ result = self.take(sampled_indices, axis=axis) │
│ 5775 │ │ │
│ 5776 │ │ if ignore_index: │
│ │
│ C:\Users\PS\AppData\Roaming\Python\Python310\site-packages\pandas\core\sample.py:150 in sample │
│ │
│ 147 │ │ else: │
│ 148 │ │ │ raise ValueError("Invalid weights: weights sum to zero") │
│ 149 │ │
│ ❱ 150 │ return random_state.choice(obj_len, size=size, replace=replace, p=weights).astype( │
│ 151 │ │ np.intp, copy=False │
│ 152 │ ) │
│ 153 │
│ │
│ in numpy.random.mtrand.RandomState.choice:909 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError: a must be greater than 0 unless no samples are taken
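
The root cause is visible in the pandas frames: `self.records` has zero rows (note the `Processing samples: 0it` lines, which indicate no samples were ever generated or loaded), and `DataFrame.sample` cannot draw from an empty frame — numpy's `RandomState.choice` raises exactly this `ValueError` when asked to pick from a population of size 0. A minimal reproduction, independent of AutoPrompt:

```python
import pandas as pd

# An empty DataFrame, as self.records is when no dataset was loaded
# or generated. Column names here are placeholders.
records = pd.DataFrame(columns=["text", "annotation"])

try:
    records.sample(n=1)
except ValueError as e:
    # numpy refuses to choose 1 element from a population of size 0
    print(e)
```
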

During handling of the above exception, another exception occurred:

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ D:\lla\AutoPrompt-main\AutoPrompt\run_pipeline.py:44 in │
│ │
│ 41 pipeline = OptimizationPipeline(config_params, task_description, initial_prompt, output_ │
│ 42 if (opt.load_path != ''): │
│ 43 │ pipeline.load_state(opt.load_path) │
│ ❱ 44 best_prompt = pipeline.run_pipeline(opt.num_steps) │
│ 45 print('\033[92m' + 'Calibrated prompt score:', str(best_prompt['score']) + '\033[0m') │
│ 46 print('\033[92m' + 'Calibrated prompt:', best_prompt['prompt'] + '\033[0m') │
│ 47 │
│ │
│ D:\lla\AutoPrompt-main\AutoPrompt\optimization_pipeline.py:281 in run_pipeline │
│ │
│ 278 │ │ # Run the optimization pipeline for num_steps │
│ 279 │ │ num_steps_remaining = num_steps - self.batch_id │
│ 280 │ │ for i in range(num_steps_remaining): │
│ ❱ 281 │ │ │ stop_criteria = self.step(i, num_steps_remaining) │
│ 282 │ │ │ if stop_criteria: │
│ 283 │ │ │ │ break │
│ 284 │ │ final_result = self.extract_best_prompt() │
│ │
│ D:\lla\AutoPrompt-main\AutoPrompt\optimization_pipeline.py:273 in step │
│ │
│ 270 │ │ │ self.log_and_print('Stop criteria reached') │
│ 271 │ │ │ return True │
│ 272 │ │ if current_iter != total_iter-1: │
│ ❱ 273 │ │ │ self.run_step_prompt() │
│ 274 │ │ self.save_state() │
│ 275 │ │ return False │
│ 276 │
│ │
│ D:\lla\AutoPrompt-main\AutoPrompt\optimization_pipeline.py:137 in run_step_prompt │
│ │
│ 134 │ │ │ │ │ batch['extra_samples'] = extra_samples_text │
│ 135 │ │ │ else: │
│ 136 │ │ │ │ for batch in batch_inputs: │
│ ❱ 137 │ │ │ │ │ extra_samples = self.dataset.sample_records() │
│ 138 │ │ │ │ │ extra_samples_text = DatasetBase.samples_to_text(extra_samples) │
│ 139 │ │ │ │ │ batch['history'] = 'No previous errors information' │
│ 140 │ │ │ │ │ batch['extra_samples'] = extra_samples_text │
│ │
│ D:\lla\AutoPrompt-main\AutoPrompt\dataset\base_dataset.py:148 in sample_records │
│ │
│ 145 │ │ │ │ df_samples = self.records.sample(n) │
│ 146 │ │ │ except: │
│ 147 │ │ │ │ n = 1  # ensure the sample size is at least 1 │
│ ❱ 148 │ │ │ │ df_samples = self.records.sample(n=n) │
│ 149 │ │ │
│ 150 │ │ return df_samples │
│ 151 │
│ │
│ C:\Users\PS\AppData\Roaming\Python\Python310\site-packages\pandas\core\generic.py:5773 in sample │
│ │
│ 5770 │ │ if weights is not None: │
│ 5771 │ │ │ weights = sample.preprocess_weights(self, weights, axis) │
│ 5772 │ │ │
│ ❱ 5773 │ │ sampled_indices = sample.sample(obj_len, size, replace, weights, rs) │
│ 5774 │ │ result = self.take(sampled_indices, axis=axis) │
│ 5775 │ │ │
│ 5776 │ │ if ignore_index: │
│ │
│ C:\Users\PS\AppData\Roaming\Python\Python310\site-packages\pandas\core\sample.py:150 in sample │
│ │
│ 147 │ │ else: │
│ 148 │ │ │ raise ValueError("Invalid weights: weights sum to zero") │
│ 149 │ │
│ ❱ 150 │ return random_state.choice(obj_len, size=size, replace=replace, p=weights).astype( │
│ 151 │ │ np.intp, copy=False │
│ 152 │ ) │
│ 153 │
│ │
│ in numpy.random.mtrand.RandomState.choice:909 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError: a must be greater than 0 unless no samples are taken
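
Note that the bare `except` in `base_dataset.py` retries with `n = 1`, which fails for the same reason on a zero-row frame, producing this second traceback. A guard that clamps `n` to the number of available rows would avoid the crash — this is only a sketch, not the project's code (`sample_records` here is a free function standing in for the `DatasetBase` method):

```python
import pandas as pd

def sample_records(records: pd.DataFrame, n: int = 3) -> pd.DataFrame:
    # Clamp n to the rows actually available; an empty dataset
    # then returns an empty frame instead of raising ValueError.
    n = min(n, len(records))
    return records.sample(n=n) if n > 0 else records
```

Even with such a guard, the underlying problem remains that the pipeline has no records to sample from — the `nan` prompt scores above point to the same empty dataset.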
