
Data mining with WEKA, Part 3: Nearest Neighbor and server-side library

2010-06-23 00:00:00  Source: WEB开发网


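The data is now loaded into WEKA. Although that was a little harder than it should be, you can see that writing your own wrapper classes to pull data from a database and place it into a WEKA Instances object is both straightforward and worthwhile. In fact, if you plan to use WEKA on a server, I strongly recommend investing the time to write them, because loading data by hand in this way is tedious. Once the data is in the Instances object, you can run any data mining you want on it, so this step should be as simple as you can make it.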

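Let's run our data through the regression model and make sure the output matches what we computed with the Weka Explorer. Running the data through the regression model with the WEKA API is actually very simple, far simpler than loading the data was.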

Listing 5. Creating the regression model in WEKA

// Needed at the top of the source file
import weka.classifiers.functions.LinearRegression;

// Create the LinearRegression model, which is the data mining
// model we're using in this example
LinearRegression linearRegression = new LinearRegression();

// This method does the "magic", and will compute the regression
// model. It takes the entire dataset we've defined to this point
// When this method completes, all our "data mining" will be complete
// and it is up to you to get information from the results
linearRegression.buildClassifier(dataset);

// We are most interested in the computed coefficients in our model,
// since those will be used to compute the output values from an
// unknown data instance.
double[] coef = linearRegression.coefficients();

// Using the values from my house (from the first article), we
// plug in the values and multiply them by the coefficients
// that the regression model created. Note that we skipped
// coefficient[5] as that is 0, because it was the output
// variable from our training data
double myHouseValue = (coef[0] * 3198) +
                      (coef[1] * 9669) +
                      (coef[2] * 5) +
                      (coef[3] * 3) +
                      (coef[4] * 1) +
                      coef[6];

System.out.println(myHouseValue);
// outputs 219328.35717359098
// which matches the output from the earlier article
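
The hand-written sum in the listing above can be generalized into a small helper. What follows is only a minimal sketch in plain Java with no WEKA dependency. It assumes the coefficient array is laid out the way the listing describes: one slot per attribute, a 0 in the slot of the output attribute, and the constant term in the last slot. The class name RegressionPrediction, the method predict, and the rounded coefficient values in main are assumptions made for illustration; in real code the array would come from linearRegression.coefficients().

```java
// Minimal sketch: applies a linear-regression coefficient array to one row of
// attribute values. Plain Java, no WEKA classes required.
class RegressionPrediction {

    /**
     * Assumed layout: coef[i] is the weight of attribute i, coef[classIndex]
     * is 0 (the output attribute), and the last element is the constant term.
     */
    static double predict(double[] coef, double[] values, int classIndex) {
        if (coef.length != values.length + 1) {
            throw new IllegalArgumentException(
                "expected one coefficient per attribute plus a constant term");
        }
        double result = coef[coef.length - 1]; // start from the constant term
        for (int i = 0; i < values.length; i++) {
            if (i == classIndex) {
                continue; // the output attribute does not contribute
            }
            result += coef[i] * values[i];
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical, rounded coefficients used only for illustration;
        // real values come from linearRegression.coefficients().
        double[] coef = {-26.6882, 7.0551, 43166.0767, 0.0, 42292.0901, 0.0, -21661.1208};
        // The five input values from the listing, then a placeholder for the
        // unknown output attribute at index 5.
        double[] myHouse = {3198, 9669, 5, 3, 1, 0};
        System.out.println(predict(coef, myHouse, 5));
    }
}
```

With these rounded coefficients the result comes out at roughly 219,328, close to the value printed by the listing. Passing the index of the output attribute explicitly keeps the helper usable when that attribute is not the last column.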