diff --git a/README.md b/README.md
index 207d5a3..152112b 100644
--- a/README.md
+++ b/README.md
@@ -9,14 +9,15 @@
 
 
 <div align="center">
- <img src="https://github.com/user-attachments/assets/d91b1b5d-c932-402c-b86d-2846620a68b0" width="800"/>
+ <img src="assets/pass-at-k-v-s-greedy-g-pass-at-k.png" width="800"/>
 </div>
 
 <!-- [🏰[Project Page](https://github.com/open-compass/GPassK/)]
 [📚[LeaderBoard](https://github.com/open-compass/GPassK/index.html)] -->
 
 ## 🚀 News
-- **[2024.12.18]** We release the **[ArXiv Paper](http://arxiv.org/abs/2412.13147)** of GPassK. 🎉🎉🎉
+- **[2025.1.6]** 🔥 **[LiveMathBench](https://huggingface.co/datasets/opencompass/LiveMathBench)** can now be accessed through Hugging Face, and you can evaluate your LLMs on it using G-Pass@k in OpenCompass. We have addressed potential errors in LiveMathBench and inconsistencies in the sampling parameters. Please also refer to the updated version of our **[Paper](http://arxiv.org/abs/2412.13147)** for further details.
+- **[2024.12.18]** We release the **[ArXiv Paper](http://arxiv.org/abs/2412.13147)** of G-Pass@k. 🎉🎉🎉
 
 
 ## ☀️Introduction
@@ -24,7 +25,7 @@
 **G-Pass@k** is a novel evaluation metric that provides a continuous assessment of model performance across multiple sampling attempts, quantifying both the model’s peak performance potential and its stability. In addition, it comes with **LiveMathBench**, a dynamic benchmark comprising challenging, contemporary mathematical problems designed to minimize data leakage risks during evaluation. In order to track the latest performance and stability of LLMs, we will continue updating the benchmark with new comptition level mathmatical problems and provide the latest results of the models on the benchmark with G-Pass@k.
 
 
-## 🌲 Definition of GPassK
+## 🌲 Definition of G-Pass@k
 $$ \text{G-Pass@}k = \mathbb{E}_{\text{Questions}} \left[ \frac{{c \choose k}}{{n \choose k}} \right] $$ 
 
 where $n$ represents the total number of generations per question, and $c$ denotes the number
@@ -42,27 +43,95 @@ Intuitively, $\text{mG-Pass@}k$ provides an interpolated estimate of the area un
 *LiveMathBench-202412 version*
 
 <div align="center">
- <img src="https://github.com/user-attachments/assets/0e5d57c6-7fec-475e-acbe-cfa6aa2088cb" width="800"/>
+ <img src="assets/performance.png" width="800"/>
 </div>
 
 
-## 🖋Use GPassK in OpenCompass
+## 🖋Use G-Pass@k in OpenCompass
 [OpenCompass](https://github.com/open-compass/opencompass) is a toolkit for evaluating the performance of large language models (LLMs). To use GPassK in OpenCompass, you can follow the steps below:
-```python
-Coming Soon...
+
+### 1. Prepare Environment
+Follow these steps to ensure your environment is ready:
+
+```bash
+# Clone the main repository
+git clone https://github.com/open-compass/GPassK.git
+cd GPassK
+
+# Create and activate a conda environment with Python 3.10 and CUDA-enabled PyTorch
+conda create -n livemathbench-eval python=3.10 pytorch torchvision torchaudio pytorch-cuda -c nvidia -c pytorch -y
+conda activate livemathbench-eval
+
+# Install additional required packages
+pip install loguru
+
+# Clone and install OpenCompass for extended functionality
+git clone https://github.com/open-compass/opencompass.git opencompass
+cd opencompass
+pip install -e .
+```
+
+
+### 2. Prepare Dataset
+The LiveMathBench dataset is hosted on Hugging Face. First, request access to the dataset at the following link: [huggingface](https://huggingface.co/datasets/opencompass/LiveMathBench).
+Then, refer to [security-tokens](https://huggingface.co/docs/hub/security-tokens) to set up your Hugging Face access token.
+
+
+### 3. Deploy Judge Models
+We use Qwen2.5-72B-Instruct as the judge model to assess the correctness of generated answers. We recommend deploying it as a service with a tool such as [vllm](https://github.com/vllm-project/vllm) or [lmdeploy](https://github.com/InternLM/lmdeploy) so that different evaluation tasks can call it.
+
+Below is an example configuration for deploying the judge model using `lmdeploy`:
+```bash
+# --tp 4: at least 4 A100 or equivalent GPUs are required
+lmdeploy serve api_server Qwen/Qwen2.5-72B-Instruct --server-port 8000 --tp 4 \
+    --cache-max-entry-count 0.9 \
+    --log-level INFO
+```
+After setting up the judge model, add its URLs to `eval_urls` in `opencompass_config_templates/*.py`. Adjust other parameters such as `k`, `temperatures`, and `llm_infos` according to your needs.
+
+> ❗️Note that omitting `eval_urls` falls back to an internal rule-based judge, which might only apply to datasets with numerical answers.
+
+### 4. Evaluation
+
+To begin the evaluation, first generate the necessary configuration files by running the following script:
+```bash
+python save_opencompass_configs.py --config_template_file {opencompass_config_templates/nono1.py|opencompass_config_templates/o1.py}
+```
+
+Upon execution, verify the generated configuration files located in `opencompass_configs/`:
+
+```
+.
+├── deepseek-math-7b-rl_t0-3_p0-8_k50_rp1-0_rs42_l8192@LiveMathBench-v202412-k4_8_16-r3.py
+├── deepseek-math-7b-rl_t0-5_p0-8_k50_rp1-0_rs42_l8192@LiveMathBench-v202412-k4_8_16-r3.py
+├── deepseek-math-7b-rl_t0-7_p0-8_k50_rp1-0_rs42_l8192@LiveMathBench-v202412-k4_8_16-r3.py
+├── deepseek-math-7b-rl_t1-0_p0-8_k50_rp1-0_rs42_l8192@LiveMathBench-v202412-k4_8_16-r3.py
+```
+
+These files follow a naming convention that reflects the model settings and dataset used:
+```
+[MODEL_ABBR]_t[TEMPERATURE]_p[TOP_P]_k[TOP_K]_rp[REPETITION_PENALTY]_rs[RANDOM_SEED]_l[MAX_OUT_LEN]@[DATASET_ABBR]_k[LIST_OF_K]_r[REPLICATION].py
+```
+
+With the configurations prepared, initiate the evaluation process with the commands below:
+
+```bash
+cd GPassK
+conda activate livemathbench-eval
+python opencompass/run.py {path/to/config_file} \
+      -w ./opencompass_outputs/ \
+      --dump-eval-details
 ```
+Refer to the OpenCompass documentation for additional arguments supported by `run.py`.
 
 
 # Citation and Tech Report
-If you use GPassK in your research, please cite the following paper:
+If you use G-Pass@k in your research, please cite the following paper:
 ```
-@misc{liu2024llmscapablestablereasoning,
-      title={Are Your LLMs Capable of Stable Reasoning?}, 
-      author={Junnan Liu and Hongwei Liu and Linchen Xiao and Ziyi Wang and Kuikun Liu and Songyang Gao and Wenwei Zhang and Songyang Zhang and Kai Chen},
-      year={2024},
-      eprint={2412.13147},
-      archivePrefix={arXiv},
-      primaryClass={cs.AI},
-      url={https://arxiv.org/abs/2412.13147}, 
+@article{liu2024your,
+  title={Are Your LLMs Capable of Stable Reasoning?},
+  author={Liu, Junnan and Liu, Hongwei and Xiao, Linchen and Wang, Ziyi and Liu, Kuikun and Gao, Songyang and Zhang, Wenwei and Zhang, Songyang and Chen, Kai},
+  journal={arXiv preprint arXiv:2412.13147},
+  year={2024}
 }
 ```
diff --git a/assets/pass-at-k-v-s-greedy-g-pass-at-k.png b/assets/pass-at-k-v-s-greedy-g-pass-at-k.png
new file mode 100644
index 0000000..a0acdc9
Binary files /dev/null and b/assets/pass-at-k-v-s-greedy-g-pass-at-k.png differ
diff --git a/assets/performance.png b/assets/performance.png
new file mode 100644
index 0000000..06b0914
Binary files /dev/null and b/assets/performance.png differ
diff --git a/docs/LiveMathBench-A.csv b/docs/LiveMathBench-A.csv
new file mode 100644
index 0000000..38512d0
--- /dev/null
+++ b/docs/LiveMathBench-A.csv
@@ -0,0 +1,20 @@
+Model,Greedy,G-Pass@16-0.5,G-Pass@16-0.75,G-Pass@16-1.0,mG-Pass@16,link,opensourced,mathLM,o1-like
+Llama-3.1-8B-Instruct,24.0,18.2,11.3,4.55,10.4,https://github.com/facebookresearch/llama,TRUE,FALSE,FALSE
+Llama-3.1-70B-Instruct,29.8,30.0,22.2,12.5,20.8,https://github.com/facebookresearch/llama,TRUE,FALSE,FALSE
+Llama-3.3-70B-Instruct,40.3,36.2,28.9,19.1,27.5,https://github.com/facebookresearch/llama,TRUE,FALSE,FALSE
+Qwen2.5-7B-Instruct,37.0,36.5,27.2,16.0,25.8,https://github.com/QwenLM/Qwen,TRUE,FALSE,FALSE
+Qwen2.5-32B-Instruct,50.8,48.3,39.5,28.6,38.1,https://github.com/QwenLM/Qwen,TRUE,FALSE,FALSE
+Qwen2.5-72B-Instruct,51.7,47.3,39.6,29.0,37.8,https://github.com/QwenLM/Qwen,TRUE,FALSE,FALSE
+DeepSeek-V2.5-1210,38.7,38.9,27.9,17.3,26.7,https://github.com/deepseek-ai/DeepSeek-LLM,TRUE,FALSE,FALSE
+DeepSeek-V3.0-Chat,55.0,59.5,49.9,35.0,47.9,https://github.com/deepseek-ai/DeepSeek-V3,TRUE,FALSE,FALSE
+Mistral-Large-Instruct-2411-123B,41.6,39.4,37.1,32.9,36.4,https://example.com/mistral,TRUE,FALSE,FALSE
+Gemini-1.5-Pro-Latest,59.1,55.9,47.3,31.0,44.3,https://example.com/gemini,FALSE,FALSE,FALSE
+Claude-3.5-Sonnet,46.7,44.1,36.2,26.6,35.3,https://docs.anthropic.com/claude/docs/models-overview,FALSE,FALSE,FALSE
+GPT-4o-2024-11-20,44.8,41.9,32.9,22.2,31.6,https://openai.com/research/gpt-4,FALSE,FALSE,FALSE
+DeepSeek-Math-7B-RL,23.5,19.8,14.0,9.7,13.7,https://github.com/deepseek-ai/DeepSeek-LLM,TRUE,TRUE,FALSE
+NuminaMath-72B-CoT,40.8,34.0,27.1,14.2,25.0,https://example.com/numinamath,TRUE,TRUE,FALSE
+Qwen2.5-Math-7B-Instruct,44.1,44.1,38.3,28.1,36.6,https://github.com/QwenLM/Qwen,TRUE,TRUE,FALSE
+Qwen2.5-Math-72B-Instruct,57.6,52.7,45.4,27.9,42.3,https://github.com/QwenLM/Qwen,TRUE,TRUE,FALSE
+Skywork-o1-8B,45.4,39.3,31.9,21.7,30.4,https://example.com/skywork,TRUE,FALSE,TRUE
+QwQ-32B-Preview,72.7,74.9,65.8,40.1,61.2,https://example.com/qwq,TRUE,FALSE,TRUE
+OpenAI o1-mini,74.1,76.3,67.3,48.3,64.8,https://openai.com/research/o1,FALSE,FALSE,TRUE
diff --git a/docs/index.html b/docs/index.html
new file mode 100644
index 0000000..96ed9ba
--- /dev/null
+++ b/docs/index.html
@@ -0,0 +1,691 @@
+<!DOCTYPE html>
+<html>
+
+<link rel="preconnect" href="https://fonts.googleapis.com">
+<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
+<link href="https://fonts.googleapis.com/css2?family=JetBrains+Mono:wght@100;400&display=swap" rel="stylesheet">
+
+
+<head>
+  <meta charset="UTF-8">
+  <title>LiveMathBench Leaderboard</title>
+  <script src="https://cdnjs.cloudflare.com/ajax/libs/PapaParse/5.3.0/papaparse.min.js"></script>
+  <script src="https://cdn.jsdelivr.net/npm/echarts@5.3.3/dist/echarts.min.js"></script>
+  <link href='https://fonts.googleapis.com/css?family=Titillium+Web:400,600,400italic,600italic,300,300italic' rel='stylesheet' type='text/css'>
+  <link href="https://fonts.googleapis.com/css2?family=Material+Icons" rel="stylesheet">
+  <link rel="icon" href = "https://images.emojiterra.com/google/noto-emoji/unicode-15.1/color/512px/1f6e0.png">
+  <link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/bootstrap@5.0.0/dist/css/bootstrap.min.css">
+
+  <style>
+    body {
+      font-family: "Titillium Web", "HelveticaNeue-Light", "Helvetica Neue Light", "Helvetica Neue", Helvetica, Arial, "Lucida Grande", sans-serif;
+      font-weight: 300;
+      font-size: 20px;
+      background-color: #FFFFFF;
+      color: #000000;
+    }
+    .paper-btn {
+      position: relative;
+      text-align: center;
+
+      display: inline-block;
+      margin: 8px;
+      padding: 8px 8px;
+
+      border-width: 0;
+      outline: none;
+      border-radius: 2px;
+  
+      background-color: #E0F7FA;
+      color: #01579B !important;
+      font-size: 20px;
+      width: 200px;
+      font-weight: 600;
+    }
+    .custom-quote {
+      background-color: #fff8e3; /* light yellow background */
+      border-left: 4px solid #ffc107; /* yellow left border */
+      padding: 10px 20px; /* padding */
+      margin: 10px 0; /* margin */
+      border-radius: 4px; /* rounded corners */
+      font-family: Arial, sans-serif; /* font */
+  }
+
+  .custom-quote h5 {
+      margin: 0; /* remove the default h5 margin */
+  }
+
+  .custom-quote small {
+      color: #333; /* dark gray text */
+      font-size: 16px; /* text size */
+  }
+    .paper-btn-tapestry {
+      position: relative;
+      text-align: center;
+
+      display: inline-block;
+      margin: 8px;
+      padding: 8px 8px;
+
+      border-width: 0;
+      outline: none;
+      border-radius: 2px;
+  
+      background-color: #5364cc;
+      color: white !important;
+      font-size: 20px;
+      width: 200px;
+      font-weight: 600;
+    }
+    .toggle-btn-parent {
+      text-align: center;
+      margin-top: 10px;
+    }
+
+    .toggle-btn {
+      background-color: #007bff;
+      color: white;
+      border: none;
+      padding: 10px 20px;
+      font-size: 16px;
+      cursor: pointer;
+      border-radius: 5px;
+      display: inline-flex;
+      align-items: center;
+      justify-content: center;
+    }
+
+    .toggle-btn .material-icons {
+      margin-right: 5px;
+    }
+
+    .toggle-btn:hover {
+      background-color: #0056b3;
+    }
+
+.paper-btn-parent {
+    display: flex;
+    justify-content: center;
+    margin: 16px 0px;
+}
+
+.paper-btn:hover {
+    opacity: 0.85;
+}
+
+    .paper-btn-parent {
+    display: flex;
+    justify-content: center;
+    margin: 16px 0px;
+    }
+    .paper-btn:hover {
+    opacity: 0.85;
+    }
+    .material-icons {
+    vertical-align: -6px;
+    }
+    .container {
+    margin-left: auto;
+    margin-right: auto;
+    padding-left: 16px;
+    padding-right: 16px;
+    }
+    .centered-div {
+      width: 70%;
+      margin: 0 auto;  
+    }
+    .bold-blue {
+    color: blue;
+    font-weight: bold;
+    }
+
+    #content {
+      width: 70%;
+    }
+
+    th,
+    td {
+      text-align: left;
+    }
+
+    th {
+      background-color: #f2f2f2;
+    }
+
+    #notes {
+      font-size: 1em;
+    }
+
+    #notes h3 {
+      margin-top: 1em;
+      font-size: 2em;
+      text-align: center;
+    }
+
+    #notes li {
+      font-weight: 300;
+      margin: 1em;
+    }
+
+    .form-select {
+      font-size: 1em;
+    }
+
+    @media screen and (max-width: 1400px) {
+      body {
+        font-size: 1.6vw;
+      }
+
+      #content {
+        width: 100%;
+      }
+
+      h1 {
+        font-size: 2em;
+      }
+
+      h2 {
+        font-size: 1.6em;
+      }
+
+      h3 {
+        font-size: 1.2em;
+      }
+
+      table {
+        font-size: small;
+      }
+
+      
+    }
+
+
+  </style>
+</head>
+
+<body>
+<div class="container">
+  <div id="content" class="container-fluid d-flex flex-column align-items-center gap-3">
+    <h1 class="text-nowrap mt-5">🏆 LiveMathBench Leaderboard 🏆</h1>
+    <h2 class="fw-light text-nowrap"><small id="warning">GPassK: Are Your LLMs Capable of Stable Reasoning? <br></small></h2>
+     <div style="clear: both">
+      <div class="paper-btn-parent">
+        <a class="paper-btn" href="https://arxiv.org/abs/2412.13147">
+            <span class="material-icons"> description </span> 
+             Paper
+        </a>
+        <a class="paper-btn" href="https://github.com/open-compass/GPassK">
+            <span class="material-icons"> code </span>
+            Code
+        </a>
+      </div>
+
+        <!-- <div class="toggle-btn-parent">
+          <button class="toggle-btn" id="toggleButton">
+              <span class="material-icons"> swap_horiz </span>
+              Show Theory Scores
+          </button>
+        </div> -->
+        <div class="alert alert-info custom-quote" role="alert">
+          <h5 class="fw-light text-nowrap">
+              <small id="warning">
+                  📢 Calling for Evaluation! If you want to see your model on the leaderboard, feel free to <a href="https://github.com/open-compass/GPassK/issues">contact</a> us!!!
+              </small>
+          </h5>
+      </div>
+    </div>
+
+
+    <div>
+      <div  id="chart" style="width:100%;height:600px;"></div>
+      <div class="container-fluid d-flex flex-row flex-nowrap">
+        <div class="container-fluid d-flex flex-column align-items-center">
+          <label for="origin" class="text-danger mb-3"> LiveMathBench-2412 G-PassK Results</label>
+          <table id="origin" class="table table-responsive table-striped table-bordered flex-shrink-1 border border-danger border-3"></table>
+        </div>
+      </div>
+      <div id = "notes">
+        <h4>📝 Notes</h4>
+        <p class="inline-block mt-3">
+          <ol>
+            <li>Models labeled with 🌍 are closed-source models, while the others are open-source. </li>
+            <li>Models labeled with 🧮 are mathematics-specialized models. </li>
+            <li>Models labeled with 💡 are o1-like models with long CoT. </li>
+            <!-- <li>Feel free to <a href="https://github.com/open-compass/GPassK/pulls">file a request</a> to add your models on our leaderboard. </li> -->
+          </ol>
+        </p>
+      </div>
+    </div>
+    <section>
+      <hr>
+    </section>
+  </div>
+
+  <script>
+    const originTable = document.getElementById('origin');
+    const benchmarkRadio = document.getElementById('Benchmark');
+    const chartDom = document.getElementById('chart');
+    
+    var myChart = echarts.init(chartDom);
+
+    var option = {
+      legend: {
+        data: ['mG-Pass@16*']
+      },
+      grid: {
+        left: '1%',
+        right: '4%',
+        bottom: '3%',
+        containLabel: true
+      },
+      xAxis: {
+        name: 'Size',
+        type: 'category',
+        boundaryGap: false,
+        data: [],
+        axisLabel: {
+          formatter: function(value) {
+            return value + 'B';
+          }
+        }
+      },
+      yAxis: {
+        name: 'mG-Pass@16',
+        type: 'value',
+        show: true,
+        nameTextStyle: {
+          align: 'left',
+        },
+        splitLine: {
+          show: true,
+          lineStyle: {
+            type: 'dashed'
+          }
+        }
+      },
+      legend: {
+        data: ['open_source', 'closed_source'],
+        itemStyle: {
+          opacity: 1.0
+        },
+      },
+      tooltip: {
+        trigger: 'item',
+        axisPointer: {
+          type: 'cross'
+        }
+      },
+      series: [{
+          name: 'open_source',
+          type: 'scatter',
+          data: [],
+          itemStyle: {
+            color: '#91cc75',
+            opacity: 0.2
+          },
+          emphasis: {
+            focus: 'series'
+          },
+          lineStyle: {
+            width: 2
+          },
+          markLine: {
+            symbol: 'none',
+            emphasis: {
+              label: {
+                position: 'middle',
+                formatter: function(params) {
+                  return params.data.name;
+                }
+              },
+            },
+            data: []
+          }
+        },
+        {
+          name: 'closed_source',
+          type: 'scatter',
+          data: [],
+          itemStyle: {
+            color: '#5470c6',
+            opacity: 0.2
+          },
+          emphasis: {
+            focus: 'series'
+          },
+          lineStyle: {
+            width: 2
+          },
+          markLine: {
+            symbol: 'none',
+            emphasis: {
+              label: {
+                position: 'middle',
+                formatter: function(params) {
+                  return params.data.name;
+                }
+              },
+            },
+            data: []
+          }
+        }
+      ]
+    };
+    
+    const theaders = [
+      'Model',
+      'Greedy',
+      'G-Pass@16-0.5',
+      'G-Pass@16-0.75',
+      'G-Pass@16-1.0',
+      'mG-Pass@16',
+    ]
+
+    var data = [];
+    var currentUrl = 'LiveMathBench-A.csv';
+
+    updateTable(originTable, currentUrl, 'mG-Pass@16');
+    updateChart(currentUrl);
+
+    function clearTable() {
+      originTable.innerHTML = '';
+    }
+
+    function clearChart() {
+      option.xAxis.data = [];
+      option.series[0].data = [];
+      option.series[1].data = [];
+      option.series[0].markLine.data = [];
+      option.series[1].markLine.data = [];
+    }
+
+    function updateTable(table, url, sortColumn) {
+      clearTable();
+      Papa.parse(url, {
+        download: true,
+        header: true,
+        skipEmptyLines: true,
+        complete: function (results) {
+          results.data.sort(function (a, b) {
+            return parseFloat(b[sortColumn]) - parseFloat(a[sortColumn]);
+          });
+          displayTable(table, results.data, sortColumn);
+        }
+      });
+    }
+    
+    function updateChart(url) {
+      clearChart();
+      Papa.parse(url, {
+        download: true,
+        header: true,
+        skipEmptyLines: true,
+        complete: function (results) {
+          // Log the number of data rows
+          console.log('Number of data rows:', results.data.length);
+    
+          for (var i = 0; i < results.data.length; i++) {
+            var sizeMatch = results.data[i]['Model'].match(/\d+(\.\d+)?B/g);
+            sizeMatch = sizeMatch ? Math.round(parseFloat(sizeMatch[0].replace('B', ''))).toString() : 'N/A';
+            results.data[i]['Size'] = sizeMatch;
+          }
+          results.data.sort(function (a, b) {
+            if (parseFloat(a['Size']) - parseFloat(b['Size']) < 0) return -1;
+            if (parseFloat(a['Size']) - parseFloat(b['Size']) > 0) return 1;
+            return a['mG-Pass@16'] - b['mG-Pass@16'];
+          });
+          displayChart(results.data, url);
+        }
+      });
+    }
+
+
+    function displayTable(table, data, displayColumn){
+      var thead = document.createElement('thead');
+      var headerRow = document.createElement('tr');
+      // add rank
+      var th = document.createElement('th');
+      th.textContent = '#';
+      headerRow.appendChild(th);
+      // headers
+      theaders.forEach(function (header) {
+        var th = document.createElement('th');
+        th.textContent = header;
+        headerRow.appendChild(th);
+      });
+      thead.appendChild(headerRow);
+      table.appendChild(thead);
+
+      var tbody = document.createElement('tbody');
+      // add rank
+      var rank = 1;
+      data.forEach(function (row) {
+        var dataRow = document.createElement('tr');
+        var rankCell = document.createElement('td');
+        rankCell.textContent = rank;
+        dataRow.appendChild(rankCell);
+        var modelCell = document.createElement('td');
+        if (rank == 1) {
+          modelCell.textContent = '🥇 ';
+        } else if (rank == 2) {
+          modelCell.textContent = '🥈 ';
+        } else if (rank == 3) {
+          modelCell.textContent = '🥉 ';
+        } else {
+          modelCell.textContent = '';
+        }
+        rank++;
+        var modelLink = document.createElement('a');
+        modelLink.href = row['link'];
+        modelLink.textContent = row['Model'];
+        modelLink.classList.add('link-underline-primary');
+        modelLink.classList.add('text-nowrap');
+        modelCell.appendChild(modelLink);
+        modelCell.classList.add('d-flex');
+        modelCell.classList.add('flex-nowrap');
+        var opensourced = row['opensourced'];
+        var mathmodel = row['mathLM'];
+        var o1model = row['o1-like'];
+        if (opensourced == 'FALSE') {
+          var promptedSymbol = document.createElement('span');
+          promptedSymbol.textContent = '🌍';
+          modelCell.appendChild(promptedSymbol);
+        }
+        if (mathmodel == 'TRUE') {
+          var promptedSymbol = document.createElement('span');
+          promptedSymbol.textContent = '🧮';
+          modelCell.appendChild(promptedSymbol);
+        }
+        if (o1model == 'TRUE') {
+          var promptedSymbol = document.createElement('span');
+          promptedSymbol.textContent = '💡';
+          modelCell.appendChild(promptedSymbol);
+        }
+        dataRow.appendChild(modelCell);
+        var instructCell = document.createElement('td');
+        instructCell.classList.add('text-danger');
+        // instructCell.textContent = row['Greedy'];
+        instructCell.textContent = (currentDataset === 'LiveMathBench-T.csv') ? '-' : row['Greedy'];
+        dataRow.appendChild(instructCell);
+        var instructCell = document.createElement('td');
+        instructCell.classList.add('text-danger');
+        instructCell.textContent = row['G-Pass@16-0.5'];
+        dataRow.appendChild(instructCell);
+        var instructCell = document.createElement('td');
+        instructCell.classList.add('text-danger');
+        instructCell.textContent = row['G-Pass@16-0.75'];
+        dataRow.appendChild(instructCell);
+        var instructCell = document.createElement('td');
+        instructCell.classList.add('text-danger');
+        instructCell.textContent = row['G-Pass@16-1.0'];
+        dataRow.appendChild(instructCell);
+        var passCell = document.createElement('td');
+        passCell.classList.add('bold-blue');
+        passCell.textContent += row[displayColumn];
+        dataRow.appendChild(passCell);
+        tbody.appendChild(dataRow);
+      });
+      table.appendChild(tbody);
+    }
+
+    function displayChart(data, url) {
+      var sizeSet = new Set();
+      sizeSet.add(0);
+      data.forEach(function(row) {
+        if (row['Size'] != 'N/A') {
+          sizeSet.add(row['Size']);
+        }
+      });
+      sizeSet.add(100);
+      sizeSet.forEach(function(size) {
+        option.xAxis.data.push(size);
+      });
+
+      var maxScore = 0.0;
+      data.forEach(function(row) {
+        if (parseFloat(row['mG-Pass@16']) > maxScore) {
+          maxScore = parseFloat(row['mG-Pass@16']);
+        }
+      });
+      option.yAxis.max = maxScore + 1;
+
+      data.forEach(function(row) {
+        if (row['Size'] == 'N/A') {
+          if (row['opensourced'] == 'FALSE') {
+            option.series[1].markLine.data.push({
+              name: row['Model'],
+              yAxis: row['mG-Pass@16']
+            });
+          } else {
+            option.series[0].markLine.data.push({
+              name: row['Model'],
+              yAxis: row['mG-Pass@16']
+            });
+          }
+        } else {
+          if (row['opensourced'] == 'FALSE') {
+            option.series[1].data.push({
+              name: row['Model'],
+              value: [row['Size'], row['mG-Pass@16']],
+              size: row['Size'],
+            });
+          } else {
+            option.series[0].data.push({
+              name: row['Model'],
+              value: [row['Size'], row['mG-Pass@16']],
+              size: row['Size'],
+            });
+          }
+        }
+      });
+
+      // select the highest model of each size
+      sizeSet.forEach(function(size) {
+        var maxScore = 0.0;
+        var maxScoreIns = 0.0;
+        var maxModel, maxModelIns, align;
+
+        data.forEach(function(row) {
+          if (row['Size'] == size) {
+            if(row['opensourced'] == 'FALSE') {
+              if (parseFloat(row['mG-Pass@16']) > maxScoreIns) {
+                maxScoreIns = parseFloat(row['mG-Pass@16']);
+                maxModelIns = row['Model'];
+              }
+            } else {
+              if (parseFloat(row['mG-Pass@16']) > maxScore) {
+                maxScore = parseFloat(row['mG-Pass@16']);
+                maxModel = row['Model'];
+              }
+            }
+          }
+        });
+        var count = 0;
+        option.series[0].data.forEach(function(row) {
+          if (row['size'] == size) {
+            count += 1;
+            // Alternate label offsets so labels of same-size models do not overlap
+            var offset = [-40, 0];
+            if (count % 2 == 1) {
+              offset = [40, 0];
+            }
+            row.itemStyle = {
+              opacity: 1.0
+            };
+            row.label = {
+              show: true,
+              position: 'top',
+              offset: offset,
+              formatter: function(params) {
+                return params.data.name;
+              },
+              color: 'inherit'
+            };
+          }
+        });
+        option.series[1].data.forEach(function(row) {
+          var offset = [0, 0]; // Define the offset variable with an appropriate value
+          if (true) {
+            row.itemStyle = {
+              opacity: 1.0
+            };
+            row.label = {
+              show: true,
+              position: 'top',
+              offset: offset,
+              formatter: function(params) {
+                return params.data.name;
+              },
+              color: 'inherit'
+            };
+          }
+        });
+      });
+      option.series[1].markLine.data.forEach(function(row){
+        row.label = {
+          show: true,
+          position: 'middle',
+          formatter: function(params) {
+            return params.data.name;
+          },
+          color: 'inherit'
+        };
+      });
+      option && myChart.setOption(option);
+    }
+    // Toggle functionality for datasets
+    var currentDataset = 'LiveMathBench-A.csv';
+    var isApplicationScores = true;
+
+    function toggleDataset() {
+      if (isApplicationScores) {
+        currentDataset = 'LiveMathBench-T.csv';
+        document.querySelector('label[for="origin"]').textContent = 'LiveMathBench Theory Scores';
+        toggleButton.textContent = 'Show Application Scores';
+      } else {
+        currentDataset = 'LiveMathBench-A.csv';
+        document.querySelector('label[for="origin"]').textContent = 'LiveMathBench Application Scores';
+        toggleButton.textContent = 'Show Theory Scores';
+      }
+      isApplicationScores = !isApplicationScores;
+      updateChart(currentDataset);
+      updateTable(originTable, currentDataset, 'mG-Pass@16');
+    }
+
+    // Initial setup: hide Calculate column if starting with LiveMathBench-T.csv
+    if (currentDataset === 'LiveMathBench-T.csv') {
+      hideCalculateColumn();
+    }
+    // Add event listener to the toggle button
+    // document.getElementById('toggleButton').addEventListener('click', toggleDataset);
+
+
+    window.addEventListener("resize", () => {
+      myChart.resize();
+    });
+
+  </script> 
+</div>
+</body>
+
+</html>
\ No newline at end of file
diff --git a/opencompass_config_templates/nono1.py b/opencompass_config_templates/nono1.py
new file mode 100644
index 0000000..3b0f494
--- /dev/null
+++ b/opencompass_config_templates/nono1.py
@@ -0,0 +1,92 @@
+from itertools import product
+from mmengine.config import read_base
+
+with read_base():
+    from opencompass.configs.datasets.livemathbench.livemathbench_gen import livemathbench_datasets
+
+from opencompass.models import TurboMindModelwithChatTemplate
+
+
+k = [4, 8, 16]
+replication = 3
+version = '202412'
+temperatures = [1.0]
+
+max_out_len = 8192
+top_p = 0.8
+top_k = 50
+repetition_penalty = 1.0
+random_seed = 42
+
+eval_urls = [
+    # Put your judge model URLs here
+]
+
+
+llm_infos = [
+    # model_name_or_path/tp/batch_size
+    ('Qwen/Qwen2.5-7B-Instruct', 4, 64),
+    ('Qwen/Qwen2.5-Math-7B-Instruct', 4, 64),
+    ('meta-llama/Llama-3.1-8B-Instruct', 8, 64),
+    ('meta-llama/Llama-3.1-70B-Instruct', 8, 64),
+    ('meta-llama/Llama-3.3-70B-Instruct', 8, 64),
+    ('mistralai/Mistral-Large-Instruct-2411', 8, 64),
+    ('Qwen/Qwen2.5-32B-Instruct', 8, 64),
+    ('Qwen/Qwen2.5-72B-Instruct', 8, 64),
+    ('01-ai/Yi-1.5-34B-Chat', 8, 64),
+    ('deepseek-ai/DeepSeek-V2.5-1210', 8, 16),
+    ('google/gemma-2-27b-it', 8, 64),
+    ('deepseek-ai/deepseek-math-7b-rl', 8, 64),
+    ('internlm/internlm2-math-plus-20b', 8, 64),
+    ('Qwen/Qwen2.5-Math-72B-Instruct', 8, 64)
+]
+
+
+models = [
+    dict(
+        type=TurboMindModelwithChatTemplate,
+        abbr=llm_info[0].split('/')[-1] + \
+            f'_t{temperature}'
+            f'_p{top_p}'
+            f'_k{top_k}'
+            f'_rp{repetition_penalty}'
+            f'_rs{random_seed}'
+            f'_l{max_out_len}',
+        path=llm_info[0],
+        engine_config=dict(tp=llm_info[1]),
+        gen_config=dict(
+            do_sample=temperature >= 1e-2,  # sample unless temperature is effectively zero
+            temperature=temperature,
+            top_p=top_p,
+            top_k=top_k,
+            repetition_penalty=repetition_penalty,
+            random_seed=random_seed
+        ),
+        backend='turbomind',         
+        max_out_len=max_out_len,
+        batch_size=llm_info[2],
+        run_cfg=dict(num_gpus=llm_info[1])
+    ) for llm_info, temperature in product(llm_infos, temperatures)
+]
+
+
+livemathbench_dataset = livemathbench_datasets[0]
+livemathbench_dataset.update(dict(
+    k=k,
+    replication=replication,
+    dataset_splits=['CNMO', 'CCEE', 'AMC', 'WLPMC'],
+    dataset_languages=['cn', 'en'],
+    cot=True,
+    version=version,
+    abbr=f'LiveMathBench-v{version}_k{"-".join(map(str, [k] if isinstance(k, int) else k))}_r{replication}'
+))
+livemathbench_dataset['eval_cfg']['evaluator'].update(dict(
+    model_name='Qwen/Qwen2.5-72B-Instruct',
+    url=eval_urls,
+    k=k,
+    replication=replication
+))
+livemathbench_dataset['infer_cfg']['inferencer'].update(dict(
+    max_out_len=max_out_len
+))
+datasets = [livemathbench_dataset]
\ No newline at end of file
diff --git a/opencompass_config_templates/o1.py b/opencompass_config_templates/o1.py
new file mode 100644
index 0000000..d73e6e2
--- /dev/null
+++ b/opencompass_config_templates/o1.py
@@ -0,0 +1,80 @@
+from itertools import product
+from mmengine.config import read_base
+
+with read_base():
+    from opencompass.configs.datasets.livemathbench.livemathbench_gen import livemathbench_datasets
+
+from opencompass.models import TurboMindModelwithChatTemplate
+
+
+k = [4, 8, 16]
+replication = 3
+version = '202412'
+temperatures = [1.0]
+
+max_out_len = 32768
+top_p = 0.8
+top_k = 50
+repetition_penalty = 1.0
+random_seed = 42
+
+eval_urls = [
+    # Put your judge model URLs here
+]
+
+
+llm_infos = [
+    # (model_name_or_path, tp, batch_size)
+    ('Qwen/QwQ-32B-Preview', 8, 64),
+    ('Skywork/Skywork-o1-Open-Llama-3.1-8B', 8, 64)
+]
+
+
+models = [
+    dict(
+        type=TurboMindModelwithChatTemplate,
+        abbr=llm_info[0].split('/')[-1] +
+            f'_t{temperature}'
+            f'_p{top_p}'
+            f'_k{top_k}'
+            f'_rp{repetition_penalty}'
+            f'_rs{random_seed}'
+            f'_l{max_out_len}',
+        path=llm_info[0],
+        engine_config=dict(tp=llm_info[1]),
+        gen_config=dict(
+            do_sample=temperature >= 1e-2,  # greedy decoding when temperature is ~0
+            temperature=temperature,
+            top_p=top_p,
+            top_k=top_k,
+            repetition_penalty=repetition_penalty,
+            random_seed=random_seed
+        ),
+        backend='turbomind',
+        max_out_len=max_out_len,
+        batch_size=llm_info[2],
+        run_cfg=dict(num_gpus=llm_info[1])
+    ) for llm_info, temperature in product(llm_infos, temperatures)
+]
+
+
+livemathbench_dataset = livemathbench_datasets[0]
+livemathbench_dataset.update(dict(
+    k=k,
+    replication=replication,
+    dataset_splits=['CNMO', 'CCEE', 'AMC', 'WLPMC'],
+    dataset_languages=['cn', 'en'],
+    cot=True,
+    version=version,
+    abbr=f'LiveMathBench-v{version}-k{"_".join(map(str, [k] if isinstance(k, int) else k))}-r{replication}'
+))
+livemathbench_dataset['eval_cfg']['evaluator'].update(dict(
+    model_name='Qwen/Qwen2.5-72B-Instruct',
+    url=eval_urls,
+    k=k,
+    replication=replication
+))
+livemathbench_dataset['infer_cfg']['inferencer'].update(dict(
+    max_out_len=max_out_len
+))
+datasets = [livemathbench_dataset]
\ No newline at end of file
diff --git a/save_opencompass_configs.py b/save_opencompass_configs.py
new file mode 100644
index 0000000..3d2de39
--- /dev/null
+++ b/save_opencompass_configs.py
@@ -0,0 +1,43 @@
+import os
+from argparse import ArgumentParser
+from typing import List
+from itertools import product
+
+from loguru import logger
+from mmengine import Config, mkdir_or_exist
+
+
+def load_and_dump_oc_configs(args) -> List[str]:
+    num_automatic_task = 0
+    save_dir = './opencompass_configs'
+    mkdir_or_exist(save_dir)
+    # Split the template into one standalone config per (model, dataset) pair.
+    cfg = Config.fromfile(args.config_template_file)
+    paths = []
+    for model_cfg, data_cfg in product(cfg['models'], cfg['datasets']):
+        save_path = os.path.join(save_dir,
+                                 f"{model_cfg['abbr'].replace('.', '-')}"
+                                 f"@{data_cfg['abbr']}.py")
+        automatic_task_cfg = Config(dict(models=[model_cfg]))
+        automatic_task_cfg.merge_from_dict(dict(datasets=[data_cfg]))
+        automatic_task_cfg.dump(save_path)
+
+        logger.info(f'|----------> Saved opencompass config file to {save_path}')
+        paths.append(save_path)
+        num_automatic_task += 1
+
+    logger.info(f'|----------> Finished saving {num_automatic_task} opencompass config files')
+
+    return paths
+
+
+if __name__ == '__main__':
+    parser = ArgumentParser()
+    parser.add_argument('-tc', '--config_template_file',
+                        type=str,
+                        required=True,
+                        help='path to the opencompass config template file')
+
+    args = parser.parse_args()
+
+    load_and_dump_oc_configs(args)
\ No newline at end of file