Fetching HTML page content with HttpClient on Android
2010-02-27 04:28:00 Source: WEB开发网
The library used is commons-httpclient-3.1.jar; download it if you are interested. The code is as follows:
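import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

import org.apache.commons.httpclient.DefaultHttpMethodRetryHandler;
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpStatus;
import org.apache.commons.httpclient.methods.GetMethod;
import org.apache.commons.httpclient.params.HttpMethodParams;

private String getHtmlContent(final String url) {
    // Accumulates the page text that is returned at the end
    StringBuilder resultBuffer = new StringBuilder();
    // Create the HttpClient instance
    HttpClient httpClient = new HttpClient();
    // Create the GET method instance
    GetMethod getMethod = new GetMethod(url);
    // Use the default retry handler provided by the library
    getMethod.getParams().setParameter(HttpMethodParams.RETRY_HANDLER,
            new DefaultHttpMethodRetryHandler());
    // Charset to assume when the response does not declare one
    // getMethod.getParams().setParameter(HttpMethodParams.HTTP_CONTENT_CHARSET, "GB2312");
    getMethod.getParams().setContentCharset("GB2312");
    try {
        // Execute the GET request
        int statusCode = httpClient.executeMethod(getMethod);
        if (statusCode != HttpStatus.SC_OK) {
            System.err.println("Method failed: " + getMethod.getStatusLine());
        }
        // Alternative: read the whole body at once and decode it
        // byte[] responseBody = getMethod.getResponseBody();
        // String result = new String(responseBody, "GBK");
        // or: String result = getMethod.getResponseBodyAsString();
        // To inspect the charset: System.out.println(getMethod.getResponseCharSet());
        // Recommended approach: read the body as a stream
        InputStream body = getMethod.getResponseBodyAsStream();
        if (body != null) {
            BufferedReader in = new BufferedReader(new InputStreamReader(
                    body, getMethod.getResponseCharSet()));
            String inputLine;
            while ((inputLine = in.readLine()) != null) {
                resultBuffer.append(inputLine);
                // readLine() drops the line terminator, so add one back
                resultBuffer.append("\n");
            }
        }
    } catch (IOException e) {
        // HttpException is a subclass of IOException, so this covers both
        System.err.println("Request failed: " + e.getMessage());
    } finally {
        // Always release the connection
        getMethod.releaseConnection();
    }
    return resultBuffer.toString();
}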
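The stream-reading step can be tried without a network connection. The following is a minimal sketch using only the JDK, not commons-httpclient: `HtmlStreamReader`, `charsetOf` and `readAll` are hypothetical names introduced here for illustration. `charsetOf` picks the charset out of a Content-Type value and falls back to a default (such as GB2312 in the code above) when none is declared, and `readAll` decodes an input stream line by line in the same way as the recommended approach.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

// Hypothetical helper for illustration; not part of commons-httpclient.
class HtmlStreamReader {

    // Returns the charset named in a Content-Type value such as
    // "text/html; charset=GB2312", or the fallback if none is declared.
    static String charsetOf(String contentType, String fallback) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                // Case-insensitive match on the "charset=" prefix
                if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                    String name = p.substring(8).trim();
                    // Strip optional surrounding double quotes
                    if (name.length() >= 2 && name.startsWith("\"") && name.endsWith("\"")) {
                        name = name.substring(1, name.length() - 1);
                    }
                    if (name.length() > 0) {
                        return name;
                    }
                }
            }
        }
        return fallback;
    }

    // Decodes the stream with the given charset and returns its text,
    // with every line terminated by a single '\n'.
    static String readAll(InputStream body, String charset) throws IOException {
        BufferedReader in = new BufferedReader(new InputStreamReader(body, charset));
        try {
            StringBuilder sb = new StringBuilder();
            String line;
            while ((line = in.readLine()) != null) {
                sb.append(line).append('\n');
            }
            return sb.toString();
        } finally {
            in.close();
        }
    }
}
```

For example, `HtmlStreamReader.readAll(stream, HtmlStreamReader.charsetOf(contentType, "GB2312"))` decodes `stream` with the declared charset when there is one and with GB2312 otherwise. Decoding with the wrong charset is the usual cause of garbled Chinese text in fetched pages.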
Tags: Android, HttpClient, fetching
Editor: coldstar