Monday, 2004-08-23 15:17
Hello limodou:
Heh heh heh! It can be downloaded after all, can't it!?
shRequest.add_header("Accept-Language","zh-cn")
shRequest.add_header("Content-Type","text/html; charset=gb2312")
shRequest.add_header("User-Agent","Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.1.4322)")
The key is to emulate a browser by declaring its exact identifying information!
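For readers on Python 3, here is a minimal sketch of the same header setup. It assumes `urllib.request` as the stand-in for the `urllib2` module used in this thread; the helper name `build_browser_request` is hypothetical. The `Content-Type` header is left out because a GET request carries no body for it to describe. The sketch only builds the request and prints its headers, so it runs without network access.

```python
import urllib.request

# URL taken from the original question in this thread.
URL = ("http://directory.google.com/Top/Sports/Basketball/College_and_University/"
       "NCAA_Division_I/Pacific-10_Conference/")
BROWSER_UA = ("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; "
              ".NET CLR 1.1.4322)")


def build_browser_request(url):
    """Hypothetical helper: a Request that identifies itself like a browser."""
    request = urllib.request.Request(url)
    request.add_header("Accept-Language", "zh-cn")
    request.add_header("User-Agent", BROWSER_UA)
    return request


request = build_browser_request(URL)
# Request stores header names capitalized, e.g. "User-agent".
for name, value in request.header_items():
    print("%s: %s" % (name, value))
```

Passing `request` to `urllib.request.urlopen()` would send it; that step is omitted here because it needs a live connection, and how the site responds today is not something this sketch can vouch for.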
/******** [2004-08-23]15:16:49 ; limodou wrote:
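limodou> Hello guochen!
limodou> I think it may be the browser's agent information, because NetAnts can send this information and even lets you change it. Adding it to the HTTP headers should be enough. Give it a try.
limodou> ======= 2004-08-23 14:35:52 You wrote: =======
>>Hello limodou!
>>
>> But NetAnts can download it
>>
>>======= 2004-08-23 14:33:00 You wrote: =======
>>
>>>Hello guochen!
>>>
>>> It probably checks some of the information a browser sends, but I don't know exactly what it checks.
>>>
>>>======= 2004-08-23 14:23:58 You wrote: =======
>>>
>>>>Can anyone fetch this page programmatically?
>>>>http://directory.google.com/Top/Sports/Basketball/College_and_University/NCAA_Division_I/Pacific-10_Conference/
>>>>I opened a connection with urllib.urlopen() and then called read(), but got "forbidden".
>>>>Once I have a connection from HTTPConnection, how do I get the page?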
>>>>
>>>>
>>>>_______________________________________________
>>>>python-chinese list
>>>>python-chinese at lists.python.cn
>>>>http://python.cn/mailman/listinfo/python-chinese
>>>
>>>= = = = = = = = = = = = = = = = = = = =
>>>
>>>
>>> Regards,
>>>
>>>
>>> limodou
>>> chatme at 263.net
>>> 2004-08-23
>>>
>>
>>= = = = = = = = = = = = = = = = = = = =
>>
>>
>> Regards,
>>
>>
>> guochen
>> guochen at 1218.com.cn
>> 2004-08-23
>>
>>
>>
limodou> = = = = = = = = = = = = = = = = = = = =
limodou> Regards,
limodou> limodou
limodou> chatme at 263.net
limodou> 2004-08-23
********************************************/
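The quoted question about `HTTPConnection` went unanswered in the thread, so here is a hedged Python 3 sketch of one way to do it: call `request()`, then `getresponse()`, then `read()`. It assumes `http.client` as the Python 3 home of `HTTPConnection`. `PickyHandler` and `get_page` are hypothetical names, and the local server only imitates a site that answers 403 unless the `User-Agent` looks like a browser; it makes no claim about what the real site checks.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

BROWSER_UA = ("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; "
              ".NET CLR 1.1.4322)")


class PickyHandler(BaseHTTPRequestHandler):
    """Hypothetical test server: 403 unless the User-Agent looks like a browser."""

    def do_GET(self):
        agent = self.headers.get("User-Agent", "")
        if agent.startswith("Mozilla/"):
            status, body = 200, b"page content"
        else:
            status, body = 403, b"forbidden"
        self.send_response(status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def get_page(host, port, path, headers=None):
    """Fetch one page over a plain HTTPConnection; return (status, body)."""
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        conn.request("GET", path, headers=headers or {})
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()


# Port 0 lets the operating system pick a free port for the local server.
server = HTTPServer(("127.0.0.1", 0), PickyHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
try:
    without_agent = get_page("127.0.0.1", server.server_port, "/")
    with_agent = get_page("127.0.0.1", server.server_port, "/",
                          {"User-Agent": BROWSER_UA})
finally:
    server.shutdown()
    server.server_close()

print(without_agent)
print(with_agent)
```

`http.client` sends no `User-Agent` header unless one is supplied, which is why the first call is refused by this stand-in server and the second one succeeds.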
--
Free as in Freedom
Zoom.Quiet
#=========================================#
]Time is unimportant, only life important![
#=========================================#
sender is the Bat!2.12.00
-------------- next part --------------
# -*- coding: utf-8 -*-
# Copyright (c) 2004 The Woodpecker Foundation
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# 1. Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution.
#
# THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
# ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
# OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
# HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
# LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
# OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
# SUCH DAMAGE.
#
# $Header: /woodpecker/otter/ottertools/timer.py,v 1.4 2004/08/18 04:25:49 hd Exp $
"""
General-purpose run timer for Python programs
@author: U{Zoom.Quiet <zoomq at infopro.cn>}
@version: 1.0
"""
import time


class timer:
    """Main timer class; messages are accumulated in self.log.

    Example:
    >>> watch = timer()
    >>> watch.start()
    >>> ....run some code
    >>> print(watch.step())
    >>> ....run some code
    >>> print(watch.step())
    """

    def __init__(self):
        self.log = ""
        # Time points use *_time names so they do not shadow start() and stop().
        self.start_time = None
        self.stop_time = None

    def __repr__(self):  # self-description of the class
        return "Code timing with Python's built-in time module!" + self.log

    def start(self):
        """
        Initialize the timer state and start the clock.
        Attributes:
        log -- records the run status
        start_time -- the starting time point
        """
        self.start_time = time.time()
        self.log += "\n run at:" + time.strftime(" %Y-%m-%d %X", time.localtime(self.start_time))
        return self.log

    def stop(self):
        """
        Read the clock!
        Attributes:
        log -- records the ending time point
        """
        self.stop_time = time.time()
        self.log += "\n end at:" + time.strftime(" %Y-%m-%d %X", time.localtime(self.stop_time))
        self.log += "\n total time for this run: %s seconds" % (self.stop_time - self.start_time)
        return self.log

    def step(self):
        """
        Take a lap reading!
        Attributes:
        log -- accumulates the information for each time point
        """
        self.stop_time = time.time()
        self.log += "\n end at:" + time.strftime(" %Y-%m-%d %X", time.localtime(self.stop_time))
        self.log += "\n total time for this run: %s seconds" % (self.stop_time - self.start_time)
        return self.log


if __name__ == '__main__':  # self-test
    watch = timer()
    if watch:
        import CBfilter  # external module, not included in this post
        playCB = CBfilter.CBfilter()
        watch.start()
        result = playCB.play(4)
        print(watch.step())
        print("#" * 7)
        print(watch.step())
        #print result
    else:
        print("\"\"")
-------------- next part --------------
# -*- coding: utf-8 -*-
# file wget.py
#/**
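# @brief General-purpose class for fetching web pages
# @version 1.1 040823 split out as a standalone module
# @version 1.0 040105 use a Request object to send a suitable HTTP request and retrieve Chinese-language pages!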
# @version v0.1 040104 Original design
# @author Zoom Quiet (zoomq at itcase.com)
# @attention Released under GNU Lesser GPL library license
# @par
# @return
# @sa
#*/
import sys
import urllib2
import getopt


class wget:
    def __init__(self):
        self.log = "\n\n"
        #os.chdir(_fpath) # change into the directory that holds the file
        # param[0] is the output file name; param[1], once set, is the URL to fetch
        self.param = ["wget.htm"]
        self.lui()

    # self-description of the class
    def __repr__(self):
        return """
Usage:
    python wget.py [-h,--help] [-g,--grasp URL] [-o,--outport FILE]
    -h,--help      print this help message
    -g,--grasp     URL of the page to fetch
    -o,--outport   output file name (default: wget.htm)
%s
""" % self.log

    def lui(self):
        """Parse the command-line options into self.param."""
        opts, args = getopt.getopt(sys.argv[1:], "hg:o:", ["grasp=", "outport=", "help"])
        print opts
        for o, a in opts:
            if o in ("-h", "--help"):
                print self
                sys.exit()
            if o in ("-o", "--outport"):
                print "output file:: %s " % a
                self.param[0] = a
            if o in ("-g", "--grasp"):
                print "fetching page:: %s " % a
                self.param.append(a)

    def grasp(self):
        print self.param
        if len(self.param) < 2:  # no -g/--grasp URL was given
            print self
            sys.exit(1)
        print "start fetching from [%s]" % self.param[1]
        #flob = urllib2.urlopen('http://mobile.wunderground.com/auto/mobile/global/stations/58362.html')
        #flob = urllib2.urlopen('http://www.wunderground.com/global/stations/58362.html')
        #shRequest = urllib2.Request("http://mobile.wunderground.com/auto/mobile/global/stations/58362.html")
        shRequest = urllib2.Request(self.param[1])
        # Identify as a browser so that sites checking these headers serve the page
        shRequest.add_header("Accept-Language", "zh-cn")
        shRequest.add_header("Content-Type", "text/html; charset=gb2312")
        shRequest.add_header("User-Agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.1.4322)")
        fload = urllib2.urlopen(shRequest)
        _fobj = fload.read()
        #print _fobj
        out = open(self.param[0], "wb")  # binary mode: keep the bytes exactly as received
        try:
            out.write(_fobj)
        finally:
            out.close()
        print "written to %s" % self.param[0]
        return _fobj


if __name__ == '__main__':  # this way the module can be
    w = wget()              # imported by other programs as well
    import timer
    watch = timer.timer()
    watch.start()
    w.grasp()
    print watch.stop()