Search - switch java

Search list

[Internet-Network] Writing an HTML File Analyzer in Java

Description:

Writing an HTML File Analyzer in Java

 I. Overview

    

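    The core task of a Web server is to analyze correctly the tags in an HTML file, and the interpreter of a programming language likewise analyzes the reserved words in a source file before interpreting it. In practice we often need to perform keyword analysis on some particular type of file. For example, we may need to download an HTML file together with the related .gif, .class and other files, which requires separating out the tags in the HTML file and finding the file names and directories we need. Before Java appeared, this kind of work meant examining every character in the file to pick out the required parts, which took a lot of code and was error-prone. In a recent project the author used Java's input stream class StreamTokenizer to analyze HTML files, with good results. Our goal here is to download an HTML file from a known Web page, analyze it, and then download the HTML files (if the page uses frames), image files and class (Java Applet) files that the page contains.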

    

    II. StreamTokenizer

    

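    StreamTokenizer turns an input stream into a stream of tokens. The tokens fall into three categories: words (multi-character tokens), single-character tokens, and whitespace (which includes comments in the style of Java and C/C++).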

    

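    The constructor of the StreamTokenizer class is: StreamTokenizer(InputStream in). This form has been deprecated since JDK 1.1 in favor of StreamTokenizer(Reader r), which is the form the example below uses.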

    

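    The class has several public instance variables: ttype, sval and nval, which hold the token type, the current string value and the current numeric value respectively. To obtain the characters between tokens (that is, between HTML tag delimiters), read the variable sval. To advance to the next token, call nextToken(). The method returns an int. A single-character token is returned as the value of the character itself; apart from that, there are four possible return values: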

    

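    StreamTokenizer.TT_NUMBER: the token is a number. Its value is a double and can be read from the instance variable nval.

    

    StreamTokenizer.TT_WORD: the token is a non-numeric word (other characters may be part of it). The word can be read from the instance variable sval.

    

    StreamTokenizer.TT_EOL: the token is an end-of-line marker.

    

    StreamTokenizer.TT_EOF: the end of the stream has been reached.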
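    The return values listed above can be seen in a short loop. The following is a minimal sketch, not part of the original project: the class name TokenDump, the method describeTokens and the sample input are made up for illustration, and the tokenizer keeps its default syntax table except that eolIsSignificant(true) is called so that TT_EOL is reported.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: lists the tokens StreamTokenizer produces.
class TokenDump {

    // Reads every token from the reader and describes it as a string.
    static List<String> describeTokens(Reader in) throws IOException {
        StreamTokenizer st = new StreamTokenizer(in);
        st.eolIsSignificant(true); // report line ends as TT_EOL tokens
        List<String> out = new ArrayList<>();
        int type;
        while ((type = st.nextToken()) != StreamTokenizer.TT_EOF) {
            switch (type) {
                case StreamTokenizer.TT_NUMBER:
                    out.add("NUMBER " + st.nval); // numeric value, a double
                    break;
                case StreamTokenizer.TT_WORD:
                    out.add("WORD " + st.sval);   // string value
                    break;
                case StreamTokenizer.TT_EOL:
                    out.add("EOL");
                    break;
                default:
                    // an ordinary character is returned as its own value
                    out.add("CHAR " + (char) type);
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        for (String s : describeTokens(new StringReader("width 640\nheight = 480"))) {
            System.out.println(s);
        }
    }
}
```

    The default branch matters for the HTML case: a character that is neither a word character nor whitespace comes back as its own character code, which is how '<' and '>' can be recognized later.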

    

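    Before the first call to nextToken(), set up the syntax table of the input stream so that the tokenizer can tell the different kinds of characters apart. The whitespaceChars(int low, int hi) method defines the range of characters that carry no meaning. The wordChars(int low, int hi) method defines the range of characters that make up words.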
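    As an illustration of setting up a syntax table, the sketch below splits a comma-separated list of file names. It is an assumed example, not code from the project: the class SyntaxTableDemo, the method words and the input string are hypothetical. After resetSyntax() every character is ordinary, so the word and whitespace ranges are declared explicitly.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: configures a syntax table by hand.
class SyntaxTableDemo {

    // Returns the words found in text, treating commas as whitespace.
    static List<String> words(String text) throws IOException {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        st.resetSyntax();              // start from an empty table
        st.wordChars('a', 'z');        // characters that may form a word
        st.wordChars('A', 'Z');
        st.wordChars('0', '9');        // digits count as word characters here
        st.wordChars('.', '.');
        st.wordChars('_', '_');
        st.whitespaceChars(0, ' ');    // control characters and space
        st.whitespaceChars(',', ',');  // commas only separate words

        List<String> out = new ArrayList<>();
        while (st.nextToken() != StreamTokenizer.TT_EOF) {
            if (st.ttype == StreamTokenizer.TT_WORD) {
                out.add(st.sval);
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(words("logo.gif, main_menu.html,Clock.class"));
    }
}
```

    Because resetSyntax() also switches off number parsing, names such as "menu2.html" stay whole instead of being split into a word and a number.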

    

    III. Implementation

    

    1. Implementing the HtmlTokenizer class

    

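    Before a token stream can be analyzed, its syntax table must be set up; in this example that means letting the program work out which words are HTML tags. The token stream class for the HTML tags we need is defined below. It is a subclass of StreamTokenizer: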

    

    

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.StreamTokenizer;

    class HtmlTokenizer extends StreamTokenizer {
        // Token codes returned by nextHtml(). Only the tags needed in this
        // example are defined; extend the list as required.
        static final int HTML_TEXT = -1;
        static final int HTML_UNKNOWN = -2;
        static final int HTML_EOF = -3;
        static final int HTML_IMAGE = -4;
        static final int HTML_FRAME = -5;
        static final int HTML_BACKGROUND = -6;
        static final int HTML_APPLET = -7;

        boolean outsideTag = true; // true while we are not inside <...>

        // Constructor: defines the syntax table of this token stream.
        public HtmlTokenizer(BufferedReader r) {
            super(r);
            this.resetSyntax();      // reset the syntax table
            this.wordChars(0, 255);  // all characters 0..255 may form a token
            this.ordinaryChar('<');  // delimiters on either side of an HTML tag
            this.ordinaryChar('>');
        } // end of constructor

        public int nextHtml() {
            int token; // current token
            try {
                switch (token = this.nextToken()) {
                case StreamTokenizer.TT_EOF:
                    // end of the stream reached
                    return HTML_EOF;
                case '<': // entering a tag
                    outsideTag = false;
                    return nextHtml();
                case '>': // leaving a tag
                    outsideTag = true;
                    return nextHtml();
                case StreamTokenizer.TT_WORD:
                    // the token is a word: decide which tag it is
                    if (allWhite(sval))
                        return nextHtml(); // skip whitespace-only tokens
                    else if (sval.toUpperCase().indexOf("FRAME") != -1 && !outsideTag)
                        return HTML_FRAME; // FRAME tag
                    else if (sval.toUpperCase().indexOf("IMG") != -1 && !outsideTag)
                        return HTML_IMAGE; // IMG tag
                    else if (sval.toUpperCase().indexOf("BACKGROUND") != -1 && !outsideTag)
                        return HTML_BACKGROUND; // BACKGROUND attribute
                    else if (sval.toUpperCase().indexOf("APPLET") != -1 && !outsideTag)
                        return HTML_APPLET; // APPLET tag
                    // any other word: plain text outside a tag,
                    // or a tag this example does not track
                    return outsideTag ? HTML_TEXT : HTML_UNKNOWN;
                default:
                    System.out.println("Unknown tag: " + token);
                    return HTML_UNKNOWN;
                } // end of switch
            } catch (IOException e) {
                System.out.println("Error:" + e.getMessage());
            }
            return HTML_UNKNOWN;
        } // end of nextHtml

        // Returns true if s consists only of whitespace. The article omits
        // the body; this version assumes "white" means whatever trim() removes.
        protected boolean allWhite(String s) {
            return s.trim().length() == 0;
        } // end of allWhite

    } // end of class

    

    The approach above was tested successfully in a recent project. The operating system was Windows NT 4 and the development tool was Inprise JBuilder 3.
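    To get from a recognized tag to the file name that the overview asks for, the attribute value still has to be pulled out of sval, which at that point holds the whole text between '<' and '>' (for example IMG SRC="logo.gif"). The article does not show this step. The sketch below is one possible way to do it with a regular expression; the class TagAttributes, the method attributeValue and the sample tag strings are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts an attribute value from raw tag text.
class TagAttributes {

    // Returns the value of attribute `name` in tagText, or null if absent.
    // Handles double-quoted, single-quoted and unquoted values.
    static String attributeValue(String tagText, String name) {
        Pattern p = Pattern.compile(
                "\\b" + Pattern.quote(name)
                + "\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)'|([^\\s\"'>]+))",
                Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(tagText);
        if (!m.find()) {
            return null;
        }
        // exactly one of the three capture groups is set
        for (int g = 1; g <= 3; g++) {
            if (m.group(g) != null) {
                return m.group(g);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // strings of the kind sval would hold after nextHtml() reports a tag
        System.out.println(attributeValue("IMG SRC=\"images/logo.gif\" ALT=\"Logo\"", "src"));
        System.out.println(attributeValue("applet code=Clock.class width=100", "code"));
    }
}
```

    A caller would typically ask for SRC after HTML_IMAGE or HTML_FRAME, CODE after HTML_APPLET, and BACKGROUND after HTML_BACKGROUND, then resolve the result against the URL of the page being analyzed.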


Platform: | Size: 1066 | Author: tiberxu | Hits:

[WinSock-NDIS] java-testnet

Description: A Java example of communication between a PC and an Ethernet-to-serial device, which can be used to test the device's data transmission and reception. The applet directory contains an example of how to communicate with the module from a Java applet. To use it, first download index.htm and test.jar to the module, then enter http://<module IP address>/index.htm in IE to access it. (Currently only the ZNET-200T supports this download function.)
Platform: | Size: 10726 | Author: 东子 | Hits:

[JSP/Java] JAVA

Description: if-else statements, switch statements, loop structures, do-while statements, for statements and jump statements.
Platform: | Size: 17500 | Author: 庄文 | Hits:

[Other] Creating Help

Description: Uses INI files to implement flicker-free multi-language switching of the user interface.
Platform: | Size: 11264 | Author: 耿力 | Hits:

[JSP/Java] Image split-screen effect

Description: This program switches between transition effects while browsing images. It is written in Java.
Platform: | Size: 2048 | Author: 杨巍 | Hits:

[JSP/Java] mygo

Description: My Gomoku: client/server structure; it can switch between Go and Gomoku and can send messages.
Platform: | Size: 60416 | Author: email_xugang | Hits:

[JSP/Java] DELL_web_html

Description: Source code of the Java applet in the Web server of a DELL Layer 2 switch. The applet actively sends messages to the browser; the messages are converted into Web pages and then displayed in the browser.
Platform: | Size: 237568 | Author: 林天仁 | Hits:

[JSP/Java] A multi-user voice example

Description: The RG-S3550 is a full wire-speed, secure, intelligent multilayer switch. Its hardware supports multilayer switching and provides intelligent Layer 2 to Layer 7 traffic classification, comprehensive quality of service (QoS) and multicast management features, and full routing protocol support.
Platform: | Size: 65536 | Author: 段丽 | Hits:

[Internet-Network] java-testnet

Description: A Java example of communication between a PC and an Ethernet-to-serial device, which can be used to test the device's data transmission and reception. The applet directory contains an example of how to communicate with the module from a Java applet. To use it, first download index.htm and test.jar to the module, then enter http://<module IP address>/index.htm in IE to access it. (Currently only the ZNET-200T supports this download function.)
Platform: | Size: 10240 | Author: 东子 | Hits:

[TreeView] verticaltree

Description: While studying electronic engineering and computer science, I participated in a compiler workshop where we had to write our own programming language. To view and analyse the syntax tree for a given program, I wrote a custom-drawn tree component in those days. The original component was written in Java, and I thought it might be useful to have it as a CTreeCtrl derivative. In contrast to some other custom-drawn tree controls at CodeProject, this one does not have its own data structure for representing the tree. This means that you do not have to write different code for inserting the tree items when you want to switch from CTreeCtrl. Because this control inherits from CTreeCtrl, it is very easy to activate the stock functionality, which draws the tree in the traditional way.
Platform: | Size: 18432 | Author: gaowen | Hits:

[Internet-Network] 20072251024676

Description: A Java-based multi-torrent download program. The priority of a torrent can be set manually. It adds an IRC chat room with some basic IRC commands and shows the number of users online and their IDs. It supports multiple tracker URLs, switches automatically between trackers for torrents published on several trackers, and lets you change the tracker URL manually.
Platform: | Size: 6374400 | Author: ydl | Hits:

[JSP/Java] trafficjava

Description: Design a traffic light class. (1) Fields: position, color (red, yellow, green), display time (seconds). (2) Method: switch the light. Create and start two threads (east-west and north-south) that run concurrently. Lab requirements: (1) design the threads; (2) design a schematic interface of the intersection lights; (3) further divide the lights in each direction into three lane lights: left turn, straight ahead and right turn; (4) apply fuzzy control to the timing according to traffic volume.
Platform: | Size: 27648 | Author: 行风 | Hits:

[SNMP] ip-monitor

Description: IP monitoring program. It reads IP-MAC information from a Layer 3 switch and compares it with the information in a database. If a MAC address turns out to be an unauthorized connection, it finds the Layer 2 switch port that the machine is connected to and shuts the port down.
Platform: | Size: 60416 | Author: 杨帆 | Hits:

[JSP/Java] JAVA

Description: if-else statements, switch statements, loop structures, do-while statements, for statements and jump statements.
Platform: | Size: 17408 | Author: 庄文 | Hits:

[SNMP] snmp

Description: A class showing how to obtain switch resources; it can display the main information of a switch.
Platform: | Size: 1024 | Author: wawayu | Hits:

[JSP/Java] album

Description: Java electronic photo album: code for the kind of album that switches between pictures on a typical Web page.
Platform: | Size: 16126976 | Author: liuhe19861013 | Hits:

[Internet-Network] Echo

Description: Java Switch On/Off source
Platform: | Size: 20480 | Author: imDangerous | Hits:

[JSP/Java] Switch_user

Description: This is a very simple hacking program with user switching!
Platform: | Size: 30720 | Author: sa | Hits:

[JSP/Java] student

Description: Student management system that manages student information by switching between windows built with Java GUI programming. No database is required.
Platform: | Size: 8192 | Author: 姚琳 | Hits:

[android] java

Description: A team course project. I was responsible for writing a temperature and humidity monitoring app in Java, developed with Android Studio. It monitors in real time the temperature and humidity data sent from a microcontroller and can remotely turn on the heat and humidity preservation switch. The project files are too large, so only the source code is uploaded.
Platform: | Size: 18432 | Author: daluoa | Hits:

CodeBus www.codebus.net