Editing file: robotparser.cpython-312.opt-1.pyc
""" robotparser.py

    Copyright (C) 2000  Bastian Kleineidam

    You can choose between two licenses when using this package:
    1) GNU GPLv2
    2) PSF license for Python 2.2

    The robots.txt Exclusion Protocol is implemented as specified in
    http://www.robotstxt.org/norobots-rfc.txt
"""

import collections
import urllib.parse
import urllib.request

__all__ = ["RobotFileParser"]

RequestRate = collections.namedtuple("RequestRate", "requests seconds")


class RobotFileParser:
    """ This class provides a set of methods to read, parse and answer
    questions about a single robots.txt file.

    """

    def __init__(self, url=''):
        self.entries = []
        self.sitemaps = []
        self.default_entry = None
        self.disallow_all = False
        self.allow_all = False
        self.set_url(url)
        self.last_checked = 0

    def mtime(self):
        """Returns the time the robots.txt file was last fetched.

        This is useful for long-running web spiders that need to
        check for new robots.txt files periodically.

        """
        return self.last_checked

    def modified(self):
        """Sets the time the robots.txt file was last fetched to the
        current time.

        """
        import time
        self.last_checked = time.time()

    def set_url(self, url):
        """Sets the URL referring to a robots.txt file."""
        self.url = url
        self.host, self.path = urllib.parse.urlparse(url)[1:3]

    def read(self):
        """Reads the robots.txt URL and feeds it to the parser."""
        try:
            f = urllib.request.urlopen(self.url)
        except urllib.error.HTTPError as err:
            if err.code in (401, 403):
                self.disallow_all = True
            elif err.code >= 400 and err.code < 500:
                self.allow_all = True
        else:
            raw = f.read()
            self.parse(raw.decode("utf-8").splitlines())

    def _add_entry(self, entry):
        if "*" in entry.useragents:
            # Only the first default ("*") entry is kept.
            if self.default_entry is None:
                self.default_entry = entry
        else:
            self.entries.append(entry)

    def parse(self, lines):
        # states:
        #   0: start state
        #   1: saw user-agent line
        #   2: saw an allow or disallow line
        state = 0
        entry = Entry()

        self.modified()
        for line in lines:
            if not line:
                if state == 1:
                    entry = Entry()
                    state = 0
                elif state == 2:
                    self._add_entry(entry)
                    entry = Entry()
                    state = 0
            # Remove an optional trailing comment and strip the line.
            i = line.find('#')
            if i >= 0:
                line = line[:i]
            line = line.strip()
            if not line:
                continue
            line = line.split(':', 1)
            if len(line) == 2:
                line[0] = line[0].strip().lower()