£Á°èZ¨Ä…–K§‚«“ô4“ÒÙ´dîfUÙÃÅ WKbyÊ¦•êŽ…È®FÒ¿ÊÎóCozá¬S@6{Í:›œêZÌ:Š•_%:¢¾¾~;‘Ã~èŠ©ÊÇí`ÔÑ©úë™µ'5I¿fš×WO%ø9¾«¾DK|€ùÍD”Ýs]nHÕ¶ê×Ó¼ãžªéUWŸÈË%DÒÕ¬ï‘]/Åcx ‰ï2ß]ä6G[]S£ÔÏ¯rs{úëóµmÒï#UQxo·õÞCe]"±/aÙ&Eã4ú9Jé_ÞåëdãöKë)AÞ ¯¹ægƒÛowÐø^d™ý½ßB7áyMä9ÜÖUã !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! ��Re��@sBddlZddlZddlmZGdd�de�ZdS)�N�)�ProbingStatec@s�eZdZdZddd�Zdd�Zedd��Zd d �Zedd��Z d d�Z edd��Zedd��Z edd��ZdS)� CharSetProbergffffff�?NcCs(d|_||_tjt�|_dS)N)�_state�lang_filter�logging� getLogger�__name__�logger)�selfr�r��/builddir/build/BUILDROOT/alt-python35-pip-20.2.4-5.el8.x86_64/opt/alt/python35/lib/python3.5/site-packages/pip/_vendor/chardet/charsetprober.py�__init__'s zCharSetProber.__init__cCstj|_dS)N)r� DETECTINGr)rrrr �reset,szCharSetProber.resetcCsdS)Nr)rrrr �charset_name/szCharSetProber.charset_namecCsdS)Nr)r�bufrrr �feed3szCharSetProber.feedcCs|jS)N)r)rrrr �state6szCharSetProber.statecCsdS)Ngr)rrrr �get_confidence:szCharSetProber.get_confidencecCstjdd|�}|S)Ns([-])+� )�re�sub)rrrr �filter_high_byte_only=sz#CharSetProber.filter_high_byte_onlycCs�t�}tjd|�}xa|D]Y}|j|dd��|dd�}|j�rn|dkrnd}|j|�q"W|S)u9 We define three types of bytes: alphabet: english alphabets [a-zA-Z] international: international characters [-ÿ] marker: everything else [^a-zA-Z-ÿ] The input buffer can be thought to contain a series of words delimited by markers. This function works to filter all words that contain at least one international character. All contiguous sequences of markers are replaced by a single space ascii character. This filter applies to all scripts which do not use English characters. s%[a-zA-Z]*[�-�]+[a-zA-Z]*[^a-zA-Z�-�]?Nrs�r��r)� bytearrayr�findall�extend�isalpha)r�filtered�words�word� last_charrrr �filter_international_wordsBs z(CharSetProber.filter_international_wordscCs�t�}d}d}x�tt|��D]�}|||d�}|dkrWd}n|dkrid}|dkr(|j�r(||kr�|r�|j|||��|jd�|d}q(W|s�|j||d ��|S) a� Returns a copy of ``buf`` that retains only the sequences of English alphabet and high byte characters that are not between <> characters. Also retains English alphabet and high byte characters immediately before occurrences of >. This filter can be applied to all scripts which contain both English characters and extended ASCII characters, but is currently only used by ``Latin1Prober``. Frr�>�s