# Path recorded in the compiled module:
# /builddir/build/BUILDROOT/alt-python38-pip-22.2.1-2.el8.x86_64/opt/alt/python38/lib/python3.8/site-packages/pip/_vendor/chardet/charsetprober.py

import logging
import re

from .enums import ProbingState

# A "word" holding at least one high byte (0x80-0xFF), optionally followed by
# a single trailing marker byte.
INTERNATIONAL_WORDS_PATTERN = re.compile(
    b"[a-zA-Z]*[\x80-\xFF]+[a-zA-Z]*[^a-zA-Z\x80-\xFF]?"
)


class CharSetProber:

    SHORTCUT_THRESHOLD = 0.95

    def __init__(self, lang_filter=None):
        self._state = None
        self.lang_filter = lang_filter
        self.logger = logging.getLogger(__name__)

    def reset(self):
        self._state = ProbingState.DETECTING

    @property
    def charset_name(self):
        return None

    def feed(self, byte_str):
        raise NotImplementedError

    @property
    def state(self):
        return self._state

    def get_confidence(self):
        return 0.0

    @staticmethod
    def filter_high_byte_only(buf):
        # Collapse every run of ASCII bytes (0x00-0x7F) into a single space.
        buf = re.sub(b"([\x00-\x7F])+", b" ", buf)
        return buf

    @staticmethod
    def filter_international_words(buf):
        """
        We define three types of bytes:
        alphabet: english alphabets [a-zA-Z]
        international: international characters [\x80-\xFF]
        marker: everything else [^a-zA-Z\x80-\xFF]
        The input buffer can be thought to contain a series of words delimited
        by markers. This function works to filter all words that contain at
        least one international character. All contiguous sequences of markers
        are replaced by a single space ascii character.
        This filter applies to all scripts which do not use English characters.
        """
        filtered = bytearray()

        # Each match is a word with at least one international byte, possibly
        # ending in one marker byte.
        words = INTERNATIONAL_WORDS_PATTERN.findall(buf)

        for word in words:
            filtered.extend(word[:-1])

            # A trailing marker (non-alphabetic ASCII byte) becomes a space.
            last_char = word[-1:]
            if not last_char.isalpha() and last_char < b"\x80":
                last_char = b" "
            filtered.extend(last_char)

        return filtered

    @staticmethod
    def remove_xml_tags(buf):
        """
        Returns a copy of ``buf`` that retains only the sequences of English
        alphabet and high byte characters that are not between <> characters.
        This filter can be applied to all scripts which contain both English
        characters and extended ASCII characters, but is currently only used by
        ``Latin1Prober``.
        """
        filtered = bytearray()
        in_tag = False
        prev = 0
        # View the buffer as single-byte items so each element compares
        # against b">" and b"<".
        buf = memoryview(buf).cast("c")

        for curr, buf_char in enumerate(buf):
            if buf_char == b">":
                # Leaving a tag: text to keep starts after this byte.
                prev = curr + 1
                in_tag = False
            elif buf_char == b"<":
                if curr > prev and not in_tag:
                    # Keep the text seen since the last tag, then a space.
                    filtered.extend(buf[prev:curr])
                    filtered.extend(b" ")
                in_tag = True

        # Keep any trailing text that follows the last closed tag.
        if not in_tag:
            filtered.extend(buf[prev:])

        return filtered
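
The docstring of `filter_international_words` describes three byte classes (alphabet, international, marker) and a filter that keeps only words containing at least one international byte. The following is a minimal standalone sketch of that behavior, assuming Latin-1 style input bytes. The names `INTERNATIONAL_WORDS` and `keep_international_words` are illustrative and are not part of chardet's API; the sketch avoids importing the vendored package so that it runs on its own.

```python
import re

# Illustrative pattern (an assumption modeled on the docstring's byte classes):
# optional ASCII letters, one or more high bytes (0x80-0xFF), optional ASCII
# letters, then at most one trailing marker byte.
INTERNATIONAL_WORDS = re.compile(
    rb"[a-zA-Z]*[\x80-\xFF]+[a-zA-Z]*[^a-zA-Z\x80-\xFF]?"
)


def keep_international_words(buf: bytes) -> bytearray:
    """Keep only words that contain at least one high byte.

    A trailing marker byte (non-alphabetic ASCII) is replaced by a space.
    """
    filtered = bytearray()
    for word in INTERNATIONAL_WORDS.findall(buf):
        # Everything except the last byte is copied unchanged.
        filtered.extend(word[:-1])
        last_char = word[-1:]
        # bytes.isalpha() is ASCII-only, so high bytes are excluded
        # explicitly by the comparison against 0x80.
        if not last_char.isalpha() and last_char < b"\x80":
            last_char = b" "
        filtered.extend(last_char)
    return filtered


if __name__ == "__main__":
    sample = b"caf\xe9 au lait, na\xefve!"
    print(bytes(keep_international_words(sample)))
```

In this sketch the pure-ASCII words `au` and `lait` are dropped because they contain no high byte, while `caf\xe9` and `na\xefve` survive with their trailing markers turned into spaces.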