Extracting pure English from mixed documents

  
My child uses an English vocabulary-memorization program. Its custom vocabulary feature only accepts TXT files that contain English words alone, one word per line. However, the word lists I downloaded from the Internet mix the words with parts of speech and Chinese notes, and I only need the column of words (Figure 1). Copying them by hand is too much trouble. Is there a better way?

[Solution] Holding down the Alt key while dragging the mouse in Word selects a rectangular block of text, which makes column copying possible. Since the words sit at the start of each line, column copying can pick them out. However, the words differ in length, so no single column width covers every word without also catching part of a note. On closer inspection, a full-width space separates each English word from its Chinese note. Replacing that single space with a run of several spaces pushes the notes far enough to the right that the words and the notes fall into separate columns.

[Solution Method] Open the downloaded file in Word and copy the space between any word and the note that follows it. Click "Edit → Replace", paste the copied space into the "Find what" box, type several spaces into the "Replace with" box, and finally click "Replace All" (Figure 2).

Tip: Many articles can be prepared for column copying with a similar replacement, as long as you look carefully for a pattern. For example, each word in the article above is followed by a part of speech (n., v., and so on). Replacing "n." with several spaces followed by "n." pushes the notes for all nouns away from the words, and the other parts of speech can be handled in the same way.
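The same observation (each line starts with the English word, and the note comes after a separator) can also be used in a short script instead of Word. The following is only a minimal sketch in Python, not part of the original tip. The sample lines and the name extract_words are made up for illustration, and the sketch assumes every entry is a single word at the very start of its line.

```python
import re

# A leading run of ASCII letters, allowing hyphens and apostrophes inside the word.
# \s also matches the full-width space (U+3000), so leading padding is skipped.
WORD_RE = re.compile(r"^\s*([A-Za-z][A-Za-z'\-]*)")

def extract_words(lines):
    """Return the leading English word of each line, skipping lines without one."""
    words = []
    for line in lines:
        match = WORD_RE.match(line)
        if match:
            words.append(match.group(1))
    return words

# Hypothetical sample: word, full-width space, part of speech, Chinese note.
sample = [
    "apple\u3000n. 苹果",
    "run\u3000v. 跑",
    "well-known\u3000adj. 著名的",
    "",
]
print("\n".join(extract_words(sample)))
```

To run this on a real file, the lines would be read with open() using the file's actual encoding, which depends on how the file was saved and is not stated in the article. Multi-word phrases are not handled by this sketch.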
After the replacement in Word, runs of spaces separate the words from the notes. A few lines may still need small manual fixes (for example, deleting an extra explanation attached to a noun) so that the words stand completely clear of the notes (Figure 3). Now hold down the Alt key, drag the mouse over the whole column of words, copy the selection, and paste it into Notepad. That completes the extraction, but the job is not finished, because column copying pads the lines with a lot of white space (Figure 4). The command prompt can remove it. Suppose the file is saved as D:\a.txt. Open a command prompt, type "type D:\a.txt", then copy the output from the command prompt window and paste it into Notepad. The extra spaces are gone (Figure 5).

This article comes from [System Home] www.xp85.com
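The white-space cleanup can also be done with a few lines of script instead of the command prompt. This is a hedged sketch in Python rather than the method the article describes. The function name strip_padding and the sample text are hypothetical.

```python
def strip_padding(text):
    """Strip whitespace around each line and drop lines that end up empty."""
    cleaned = []
    for line in text.splitlines():
        word = line.strip()  # with no arguments, strip() also removes full-width spaces
        if word:
            cleaned.append(word)
    return "\n".join(cleaned)

# Hypothetical column-copied text: every line padded with spaces to the same width.
copied = "apple     \nrun       \n\nbanana    \n"
print(strip_padding(copied))
```

The result has one word per line with no padding, which is the form the vocabulary program expects for import.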