Thursday, December 22, 2005 17:25
Hello everyone,
What is a good way to read the last line of a very large file, or the last n lines? Because the file is so large, reading it sequentially takes a long time. How can this be done efficiently?
Thanks.
Tuesday, December 27, 2005 13:43
If you are on Linux, you can use a command such as
tail -n 1   # the last line
Good luck.
2005-12-27
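
For anyone who wants the result inside a Python program, the tail suggestion above could be wrapped roughly like this. It is only a sketch: the function name `tail_lines` is made up for illustration, and it assumes a `tail` executable is on the PATH (so it will not work on a stock Windows system).

```python
import subprocess


def tail_lines(path, n=1):
    """Return the last n lines of path by delegating to the system tail.

    Hypothetical helper: assumes a POSIX-style tail is installed.
    """
    result = subprocess.run(
        ["tail", "-n", str(n), "--", path],  # "--" guards paths starting with "-"
        check=True,                          # raise if tail exits with an error
        capture_output=True,
    )
    # Decode leniently so an odd byte in a log file does not raise.
    return result.stdout.decode(errors="replace").splitlines()
```

The work is done by tail itself, so the cost does not grow with the file size; the price is a dependency on an external program.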
> _______________________________________________
> python-chinese
> Post: send python-chinese at lists.python.cn
> Subscribe: send subscribe to python-chinese-request at lists.python.cn
> Unsubscribe: send unsubscribe to python-chinese-request at lists.python.cn
> Detail Info: http://python.cn/mailman/listinfo/python-chinese
Tuesday, December 27, 2005 14:40
I would like to know too; this problem has been bothering me for a long time.
--
Wishing you smooth work and a happy life.
Tuesday, December 27, 2005 15:36
How does the tail command on *nix do it? (It shows the end of a file quickly, no matter how large the file is.)

Regards,
Zarz
zarz at tom.com
2005-12-27
Tuesday, December 27, 2005 16:54
/* Print the last N_LINES lines from the end of file FD.
   Go backward through the file, reading `BUFSIZ' bytes at a time (except
   probably the first), until we hit the start of the file or have
   read NUMBER newlines.
   START_POS is the starting position of the read pointer for the file
   associated with FD (may be nonzero).
   END_POS is the file offset of EOF (one larger than offset of last byte).
   Return true if successful.  */

static bool
file_lines (const char *pretty_filename, int fd, uintmax_t n_lines,
            off_t start_pos, off_t end_pos, uintmax_t *read_pos)
{
  char buffer[BUFSIZ];
  size_t bytes_read;
  off_t pos = end_pos;

  if (n_lines == 0)
    return true;

  /* Set `bytes_read' to the size of the last, probably partial, buffer;
     0 < `bytes_read' <= `BUFSIZ'.  */
  bytes_read = (pos - start_pos) % BUFSIZ;
  if (bytes_read == 0)
    bytes_read = BUFSIZ;
  /* Make `pos' a multiple of `BUFSIZ' (0 if the file is short), so that all
     reads will be on block boundaries, which might increase efficiency.  */
  pos -= bytes_read;
  xlseek (fd, pos, SEEK_SET, pretty_filename);
  bytes_read = safe_read (fd, buffer, bytes_read);
  if (bytes_read == SAFE_READ_ERROR)
    {
      error (0, errno, _("error reading %s"), quote (pretty_filename));
      return false;
    }
  *read_pos = pos + bytes_read;

  /* Count the incomplete line on files that don't end with a newline.  */
  if (bytes_read && buffer[bytes_read - 1] != '\n')
    --n_lines;

  do
    {
      /* Scan backward, counting the newlines in this bufferfull.  */

      size_t n = bytes_read;
      while (n)
        {
          char const *nl;
          nl = memrchr (buffer, '\n', n);
          if (nl == NULL)
            break;
          n = nl - buffer;
          if (n_lines-- == 0)
            {
              /* If this newline isn't the last character in the buffer,
                 output the part that is after it.  */
              if (n != bytes_read - 1)
                xwrite_stdout (nl + 1, bytes_read - (n + 1));
              *read_pos += dump_remainder (pretty_filename, fd,
                                           end_pos - (pos + bytes_read));
              return true;
            }
        }

      /* Not enough newlines in that bufferfull.  */
      if (pos == start_pos)
        {
          /* Not enough lines in the file; print everything from
             start_pos to the end.  */
          xlseek (fd, start_pos, SEEK_SET, pretty_filename);
          *read_pos = start_pos + dump_remainder (pretty_filename, fd,
                                                  end_pos);
          return true;
        }
      pos -= BUFSIZ;
      xlseek (fd, pos, SEEK_SET, pretty_filename);

      bytes_read = safe_read (fd, buffer, BUFSIZ);
      if (bytes_read == SAFE_READ_ERROR)
        {
          error (0, errno, _("error reading %s"), quote (pretty_filename));
          return false;
        }

      *read_pos = pos + bytes_read;
    }
  while (bytes_read > 0);

  return true;
}
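
The same idea carries over to Python fairly directly. The following is a rough sketch of the block-wise backward scan in the C function above, not a line-for-line port: the name `tail_blockwise` and the `bufsize` default are assumptions made for illustration, the start position is fixed at 0, and the result is returned as bytes instead of being written to stdout.

```python
import os


def tail_blockwise(path, n_lines, bufsize=8192):
    """Return the last n_lines lines of path as bytes, scanning backward in blocks."""
    if n_lines <= 0:
        return b""
    with open(path, "rb") as f:
        end_pos = f.seek(0, os.SEEK_END)
        # Size of the last, probably partial, block, so that every later
        # read starts on a multiple of bufsize (0 only for an empty file).
        size = end_pos % bufsize or min(bufsize, end_pos)
        pos = end_pos - size
        remaining = n_lines
        first = True
        while True:
            f.seek(pos)
            buf = f.read(size)
            if first:
                first = False
                # Count the incomplete line on files that don't end with a newline.
                if buf and not buf.endswith(b"\n"):
                    remaining -= 1
            # Scan this block backward, counting newlines.
            n = len(buf)
            while True:
                n = buf.rfind(b"\n", 0, n)
                if n == -1:
                    break
                if remaining == 0:
                    # Everything after this newline is the wanted tail.
                    f.seek(pos + n + 1)
                    return f.read()
                remaining -= 1
            if pos == 0:
                # Not enough lines in the file; return all of it.
                f.seek(0)
                return f.read()
            pos -= bufsize
            size = bufsize
```

Only the blocks near the end of the file are ever read, which is why the running time depends on the length of the requested lines and not on the size of the file.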
--
In doG We Trust
Tuesday, December 27, 2005 18:45
Use mmap.
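
The mmap suggestion was not spelled out in the thread, so here is one way it might look. This is a minimal sketch under assumptions: the function name `tail_mmap` is invented, the file is mapped read-only, and on a 32-bit interpreter a file larger than the address space could not be mapped in one piece.

```python
import mmap


def tail_mmap(path, n_lines=1):
    """Return the last n_lines lines of path as bytes, using mmap and rfind."""
    if n_lines <= 0:
        return b""
    with open(path, "rb") as f:
        try:
            mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        except ValueError:
            # mmap refuses to map an empty file.
            return b""
        with mm:
            end = len(mm)
            # A trailing newline terminates the last line; skip it.
            if mm[end - 1:end] == b"\n":
                end -= 1
            pos = end
            for _ in range(n_lines):
                # Search backward for the newline before the current position.
                pos = mm.rfind(b"\n", 0, pos)
                if pos == -1:
                    break  # fewer lines than requested: return the whole file
            return mm[pos + 1:]
```

The operating system pages in only the parts of the file that `rfind` touches, so the search stays near the end of the file.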
Wednesday, December 28, 2005 09:50
Python can seek too: seek to the end of the file, then search backward from there for newline characters.
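
In its simplest form, for the last line only, the seek-and-search-backward idea might be sketched as follows. The name `last_line` is hypothetical, and stepping one byte at a time is chosen for clarity; reading a block at a time would be faster on files with very long lines.

```python
import os


def last_line(path):
    """Return the last line of path as bytes, without its trailing newline."""
    with open(path, "rb") as f:
        pos = f.seek(0, os.SEEK_END)
        if pos == 0:
            return b""  # empty file
        # Ignore a single newline at the very end of the file.
        f.seek(pos - 1)
        if f.read(1) == b"\n":
            pos -= 1
        # Walk backward until the previous newline or the start of the file.
        start = pos
        while start > 0:
            f.seek(start - 1)
            if f.read(1) == b"\n":
                break
            start -= 1
        f.seek(start)
        return f.read(pos - start)
```

Binary mode is used on purpose: in text mode Python only guarantees seeking to offsets previously returned by `tell()`, so byte arithmetic on positions is not safe there.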
Wednesday, December 28, 2005 16:17
#last lines
def last_lines(filename, lines=1):
    """
    Print the last line(s) of a text file.

    Argument filename is the name of the file to print.
    Argument lines is the number of lines to print from the end.
    """
    if lines <= 0:
        return
    block_size = 1024
    block = b''
    nl_count = 0
    start = 0
    # binary mode, so that seek/tell positions are plain byte offsets
    fsock = open(filename, 'rb')
    try:
        # seek to end
        fsock.seek(0, 2)
        # get seek position
        curpos = fsock.tell()
        while curpos > 0:  # while not BOF
            # step back block_size + the length of the last read block
            curpos -= block_size + len(block)
            if curpos < 0:
                curpos = 0
            fsock.seek(curpos)
            # read to end
            block = fsock.read()
            # a newline at the very end closes the last line, so skip it
            nl_count = block[:-1].count(b'\n')
            # if read enough (or more)
            if nl_count >= lines:
                break
        # get the exact start position
        for n in range(nl_count - lines + 1):
            start = block.find(b'\n', start) + 1
    finally:
        fsock.close()
    # print it out
    print(block[start:].decode(errors='replace'), end='')

if __name__ == '__main__':
    import sys
    last_lines(sys.argv[0], 5)  # print the last 5 lines of THIS file
Wednesday, December 28, 2005 18:16
PHP has many text-handling classes, and several of the flat-file PHP forums in China have arguably pushed text-file handling to its limit ^_^, so they are worth borrowing from; they contain a lot of good ideas. Here is an article for you to look at:
"Solving the core problems of textdb: overload and stability"
http://www.phpchina.cn/bbs/viewthread.php?tid=788&page=2
Start reading from the fourth post.