蔣以仁 (I-Jen Chiang)

Search Problem
* Search query: Jaguar
* Possible senses: Jaguar (animal), Jaguar (automobile), Jaguar (watch), Jaguar (OS)
Monika Henzinger, Search Technologies for the Internet. Science, Vol. 317, no. 5837, 468–471, 27 July 2007.

The purpose of records is to create knowledge for the future
"…records are recognized as agency assets used to underpin current business and legal needs, as well as the basis for a knowledge management system to meet future goals."
Howard P. Lowell, Director, Modern Records Programs, NARA

From data mining to decision support
* Consolidate data of the same nature
* Mine the data to produce association and dependency rules
* Visualize the results to help experts judge topics
* Define processing guidelines so that decision support can be built

KDD Process
Data, Selection, Target Data, Preprocessing, Preprocessed Data, Transformation, Transformed Data, Data Mining, Pattern, Interpretation/Evaluation, Knowledge

Data Warehouse BI architecture
[Figure labels: Data Sources (Operational DBs, other sources); Monitor & Integrator; Extract, Transform, Load, Refresh; Complete Data Warehouse; metadata; Data Marts; OLAP Server; Tools]
Business Intelligence:
1. Comprehensive Performance Management
2. Analysis
3. Query
4. Reports
5. Data mining

Gaining market intelligence from news feeds
Sreekumar Sukumaran and Ashish Sureka

Signal
Dr. Bhandari said, “I first noticed this when the New York Times did an analysis after the fact showing that early indications of the Ford Explorer-Firestone tire problem went undetected in a federal database. Recently, a similar analysis by CNN showed that early indications of security problems at Logan, Dulles, and Newark airports went undetected in a federal database well before the September 11 tragedy. It is clear that the cost of missing these patterns is too high to be ignored.”

Information integration
* Mining target: individual text
* Mining unit: texts; category-labeled items extracted from text using NLP
Original data (structured fields plus free text):
Call Taker: James
Date: Aug. 30, 2002
Duration: 10 min.
CustomerID: ADC00123
Q: cust sys has stopped working.
A: checked cust bios and it need updated.
…
[Figure labels: Structured Data, Unstructured Data, Meta Data, Category, Category Dictionary, Synonym Dictionary, Item, Visualization & Interactive Mining]
Structured data after mining:
[Call Taker] James
[Date] 2002/08/30
[Duration] 10 min.
[CustomerID] ADC00123
Linguistic analysis of the free text (tagging, dependency analysis, named entity extraction, intention analysis):
[Noun] Customer
[Software] BIOS
[Subj...Verb] customer system..stop
[SW..Problem] BIOS..need
IBM TAKMI (Nasukawa, Nagano, 1999)

What the medical literature tells us
* Source of medical literature: Medline
* Causal associations between diseases, symptoms, and drugs or compounds can be discovered
1. Swanson DR. Searching natural language text by computer. Machine indexing and text searching offer an approach to the basic problems of library automation. Science. 132:1099–1104, 21 Oct. 1960.
2. Swanson DR. Fish oil, Raynaud's syndrome, and undiscovered public knowledge. Perspect Biol Med. 30(1):7–18, 1986.
3. Swanson DR. Complementary structures in disjoint science literatures. In A. Bookstein, et al. (Eds.), SIGIR91: Proceedings of the Fourteenth Annual International ACM/SIGIR Conference on Research and Development in Information Retrieval, Chicago, Oct 13-16, 280-289, 1991.

Migraine?
* Stress is associated with migraines
* Stress can lead to loss of magnesium
* Calcium channel blockers prevent some migraines
* Magnesium is a natural calcium channel blocker
* Spreading cortical depression (SCD) is implicated in some migraines
* High levels of magnesium inhibit SCD
* Migraine patients have high platelet aggregability
* Magnesium can suppress platelet aggregability
Smalheiser, N.R. & Swanson, D.R. Assessing a gap in the biomedical literature: Magnesium deficiency and neurologic disease. Neuroscience Research Communications, 15, 1-9, 1994.

Evidence from the literature
[Figure: the set "All Migraine Research" (migraine) and the set "All Nutrition Research" (magnesium) are linked through the shared concepts CCB (calcium channel blockers), PA (platelet aggregability), SCD, and stress.]

Finding new leads: Raynaud's phenomenon
Hypothesis generation: fish oils are linked to Raynaud's through intermediate concepts:
* vasoconstriction
* platelet aggregation
* blood viscosity
Swanson, D.R. Fish oil, Raynaud's syndrome, and undiscovered public knowledge. Perspect Biol Med. Autumn;30(1):7-18, 1986.
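The migraine and Raynaud's slides both link two literatures through shared intermediate concepts. The following is a minimal sketch of that linking step, assuming a tiny hand-entered association table transcribed from the slide statements. The names `LINKS`, `bridge_concepts`, and `rank_candidates` are hypothetical, and a real system would derive the associations from Medline records rather than from a literal dictionary.

```python
from typing import Dict, List, Set, Tuple

# Concept associations transcribed from the slides (illustrative only).
LINKS: Dict[str, Set[str]] = {
    "migraine": {"stress", "calcium channel blockers",
                 "spreading cortical depression", "platelet aggregability"},
    "magnesium": {"stress", "calcium channel blockers",
                  "spreading cortical depression", "platelet aggregability"},
    "Raynaud's syndrome": {"vasoconstriction", "platelet aggregation",
                           "blood viscosity"},
    "fish oil": {"vasoconstriction", "platelet aggregation",
                 "blood viscosity"},
}


def bridge_concepts(a: str, c: str, links: Dict[str, Set[str]] = LINKS) -> Set[str]:
    """Intermediate concepts that appear with both topic a and topic c."""
    return links[a] & links[c]


def rank_candidates(a: str, links: Dict[str, Set[str]] = LINKS) -> List[Tuple[str, int]]:
    """Other topics sharing at least one intermediate concept with a, best first."""
    scored = [(c, len(bridge_concepts(a, c, links))) for c in links if c != a]
    return sorted([(c, n) for c, n in scored if n > 0],
                  key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    print(sorted(bridge_concepts("migraine", "magnesium")))
    print(rank_candidates("Raynaud's syndrome"))
```

A candidate pair with many shared intermediates and no direct co-mention is only a hypothesis to be checked by experts, which matches the "hypothesis generation" framing on the slide.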
A technology that has to be mentioned: natural language processing
* NLP began in 1948 with a dictionary look-up system at Birkbeck College, London
* 1949: Warren Weaver and the American interest in code breaking
* 1950: machine translation (German to English, Russian to English)
* 1966 onward: much thunder, little rain; word-for-word machine translation (Dr. Eye?)
* NLP brought the first hostility of research funding agencies.
* NLP gave AI a bad name before AI had a name.

Massive growth of information
* In 2006 the volume of digital information reached 161 billion GB (161 exabytes).
* IDC estimates that information will grow roughly sixfold between 2006 and 2010.
* By 2010, nearly 70% of the information in the digital universe will be created by individual users, while organizations will be responsible for the security, privacy, reliability, and regulatory compliance of at least 85% of it.
The Expanding Digital Universe, http://www.emc.com/leadership/digitaluniverse/expanding-digital-universe.htm

[Chart: unstructured data (web messages, news reports, patents, e-mail, documents, …) versus structured data (Oracle), compared by data volume and market value on a 0 to 100 scale.]

Search Engine Roadmap
[Figure labels: Exploratory Search; Affiliation (Topic Relevance Analysis); Dictionary/Ontology; Wikis; Full Text Search, including complex Boolean search; Clustering/Categorize; Synonym/Anatomy; Document Abstraction; Custom Search; Knowledge collaborative search; Filtering; Crawler; Integrate other search engines; Summarization (mobile); Multiple abstracts organization; Search log recorder; Personal tagging; Sharing; Forum/Blogger; Customized meta-search; Taxonomy search; Natural language processing/understanding; Web Page Features Extraction (semi-structured and unstructured); Feature Ranking; Feature Mapping; Recommendation; Visual Technology; Ajax; Topological Graphics; Web 2.0 or upper; Collaborative Filtering; Visualization]

Web search engines
* Web pages are crawled offline and stored in an internal data structure called an inverted index
* Retrieval is done online against that index
Monika Henzinger, Search Technologies for the Internet. Science, Vol. 317, no. 5837, 468–471, 27 July 2007.

Search Engine Problems
* Index comprehensiveness
* Relevance
* Deterministic search
* Search query: Jaguar (animal), Jaguar (automobile), Jaguar (watch), Jaguar (OS)
* Problem: scalability
J. Beall, The Weaknesses of Full-Text Searching. The Journal of Academic Librarianship, 34(5):438-444, 2008.
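The web search engine slide above describes crawling pages offline into an inverted index and answering queries online. A minimal sketch of that idea follows, assuming three made-up page texts keyed by hypothetical page ids. The function names are illustrative and do not correspond to any particular engine.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(pages: Dict[str, str]) -> Dict[str, Set[str]]:
    """Offline step: map every term to the set of pages that contain it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for page_id, text in pages.items():
        for term in tokenize(text):
            index[term].add(page_id)
    return dict(index)


def search(index: Dict[str, Set[str]], query: str) -> Set[str]:
    """Online step: return pages containing every query term (Boolean AND)."""
    terms = tokenize(query)
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


# Hypothetical crawled pages, invented for this illustration.
PAGES = {
    "page-animal": "The jaguar is a large cat native to the Americas",
    "page-car": "Jaguar is a British car maker",
    "page-os": "Jaguar was the code name of an operating system release",
}

if __name__ == "__main__":
    index = build_inverted_index(PAGES)
    print(sorted(search(index, "jaguar")))      # ambiguous one-word query
    print(sorted(search(index, "jaguar car")))  # extra term narrows the sense
```

The one-word query matches every sense of "Jaguar", which is the relevance problem the slide points at. The index itself cannot tell the senses apart without more query context.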
Evolution of search engines
* First generation: uses only on-page text data (term frequency, language). 1995-1997: AV, Excite, Lycos, etc.
* Second generation: uses off-page, web-specific data: link analysis, click-through data (what results people click on), anchor text (hyperlinks, how people refer to this page). From 1998. Made popular by Google but used by everyone now.
* Third generation: answers what the query is really trying to find out. Semantic analysis (what is this about?); focus on the user's need rather than just the query; inference of key data; helping the user; integration of search and document analysis. Still experimental.

Web search problems
Problems:
* Queries are too short and not precise enough
* Synonyms and similar terms make query matching hard to predict
* Deliberately confusing arrangements by page authors make search results unsatisfactory
* Users need extra features, such as filters
Solutions:
* Add understanding
* Rank results
Example: Trailblazer (car or basketball team)
Monika Henzinger, Search Technologies for the Internet. Science, Vol. 317, no. 5837, 468–471, 27 July 2007.

Expand Crawler
[Figure labels: Basic Crawler; Wrapper/Clipper; XHTML, DHTML Parser; Feature Transformation; XML Parser; Structural Features Extraction; HTML Parser; …; Scheduling; Clipper Windows Specify; Heritrix Crawler; Unstructured Document Features Extraction (NLP); Feature Mapping; Ontological Organization; Specific Feature Parse; …; Filtering; …; Ontology; Machine Learning Approach; Semantic Crawler; P2P Knowledge Sharing Crawler]

Crawler Classes
[Figure labels: Annotated Crawler; Crawl with specific terms/phrases; Crawler Outside Search Engine; Supporting Information from original sources & Reference contents; Filter; Data Sources; Learner; Relevant Information; Feed into Reference List; Authoring; User process; Filtering; NE records]

Web Crawler Classes
[Figure labels: Page/Section/Block/Item Specify; GUI Specification System; Scheduler; Crawler; Notify for manual tuning; Logger; Adaptor; Log; Named Entities Recognition; Comparator (compares the extracted structure between two stages); Feature Extractor; Repository]

Temporal information aggregation
* Event analysis
* Clustered retrieval
1. Walter Warnick, Problems of Searching in Web Databases. Science, Vol. 316, no. 5829, 1284, June 2007.
2. I-Jen Chiang, Discover the Semantic Topology in High-Dimensional Data, Expert Systems with Applications, 33 (1), September 2007.
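The technical-architecture and Vector Space Model slides that follow represent each document as a row of term weights and compare documents with cosine similarity or the Jaccard index. A minimal sketch of both measures follows. It assumes raw term counts as the weights, since the slides do not specify a weighting scheme, and uses three short made-up documents. All names are illustrative.

```python
import math
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Represent a document as term counts, ignoring word order."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def jaccard_index(a: Counter, b: Counter) -> float:
    """Shared distinct terms divided by all distinct terms in either document."""
    terms_a, terms_b = set(a), set(b)
    union = terms_a | terms_b
    if not union:
        return 0.0
    return len(terms_a & terms_b) / len(union)


# Made-up documents for illustration.
D1 = bag_of_words("jaguar car engine")
D2 = bag_of_words("jaguar car dealer")
D3 = bag_of_words("jaguar animal habitat")

if __name__ == "__main__":
    print(cosine_similarity(D1, D2), jaccard_index(D1, D2))
    print(cosine_similarity(D1, D3), jaccard_index(D1, D3))
```

With these toy inputs the two car documents score higher than the car and animal pair under both measures. On binary term vectors the Tanimoto coefficient coincides with the Jaccard index computed here.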
Technical architecture sketch
Document-term weight matrix:
      t1    t2    …   tn
d1    w11   w12   …   w1n
d2    w21   w22   …   w2n
…
dm    wm1   wm2   …   wmn
[Figure labels: Raw text; Sentence selection; Tokenized text; Stemming & Stop words; Term Weighting; Term similarity; Doc similarity; Vector centroid; Clustering; Summarization; Classification/document tracking; META-DATA/ANNOTATION]

Salton’s Vector Space Model
* Bag of Words
* Cosine similarity (the angle θ between vectors A and B)
* Jaccard index (Jaccard similarity coefficient, Tanimoto coefficient)
G. Salton, A. Wong, and C. S. Yang, "A Vector Space Model for Automatic Indexing," Communications of the ACM, vol. 18, nr. 11, 613–620, 1975.

Curse of Dimensions

Ambiguous sentence: "I saw the man on the hill with telescope"
* Using a telescope, I saw a man who was on a hill.
* I saw a man who was on a hill and who had a telescope.
* I saw a man who was on the hill that has a telescope on it.

New directions in natural language processing
[Figure: training sentences and answers feed a Training Program that produces NE Models; Speech goes through Speech Recognition to Text; an Extractor applies the NE Models to the text and outputs Entities.]
* Prior to 1997: no learning approach competitive with hand-built rule systems
* Since 1997: statistical approaches (BBN (Bikel et al. 1997), NYU, MITRE, CMU/JustSystems) achieve state-of-the-art performance
Example sentence, tagged for location, person, and organization:
The delegation, which included the commander of the U.N. troops in Bosnia, Lt. Gen. Sir Michael Rose, went to the Serb stronghold of Pale, near Sarajevo, for talks with Bosnian Serb leader Radovan Karadzic.
1. M. Marcus. New trends in natural language processing: Statistical natural language processing. PNAS. 92. 10052-10059, 1995.
2. Current Trends in Biomedical Natural Language Processing, Ohio State University, June 2008.
3. Tanveer Siddiqui. Natural Language Processing and Information Retrieval. Oxford Univ Press, 2008.
4. Yorick Wilks.
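Natural Language Processing as a Foundation of the Semantic Web. Foundations and Trends® in Web Science, 1(3-4), 199-327, 2009.

Knowledge map (I-Jen Chiang)
Event tracking, information retrieval, knowledge concepts:
* Dependencies among the events that occur within an issue
* Querying to understand the related arguments within an issue
* Angle of an argument (by agency, case subject, etc.)
* Influences on a given event within an issue
* Influences of a given event within an issue
* Tracking how an event is handled over time
* Drilling into details to understand phenomena and their handling
* Weighing priorities to understand the principles behind decisions
* Event tracking to analyze changes in the main thread of an issue
Examples under the prefabricated-housing issue:
* The government's 100-billion credit guarantee fund for rebuilding victims' homes in earthquake-hit areas lets victims obtain loans
* The reconstruction statute is drafted to cover construction works and subsidies

Integrated BI Systems
[Figure labels: Structural Data (DBMS, RDBMS); ETL; Complete Data Warehouse; Unstructured Data (File System, XML, Legacy CMS, Scanned Documents, Email); Text tagger & Annotator; Intermedia Data; EA. Source: Sreekumar Sukumaran and Ashish Sureka]

Annotation
Input text:
On November 16, 2005, IBM announced it had acquired Collation, a privately held company based in Redwood City, California for undisclosed amount.
Annotation types: Date, Acquiring Organization, Acquisition Event, Acquired Organization, Place, Amount

Text Annotator output to RDBMS:
Date      Organization   Place               Amount
Nov. 16   IBM            Redwood City, CA    Undisclosed

XML output:
On <Date>November 16, 2005</Date>, <ACQUIRING ORG>IBM</ACQUIRING ORG> announced it had <ACQUISITION EVENT>acquired</ACQUISITION EVENT> <ACQUIRED ORG>Collation</ACQUIRED ORG>, a privately held company based in <PLACE>Redwood City, California</PLACE> for <AMOUNT>undisclosed</AMOUNT> amount.

McIlraith, S.A., Son, T.C., Zeng, H.: Semantic web services.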
IEEE Intelligent Systems 16, 46–53, 2001

Integrated BI Systems
[Figure labels: Structural Data (DBMS, RDBMS); ETL; Complete Data Warehouse; Unstructured Data (File System, XML, Legacy CMS, Scanned Documents, Email); Text tagger & Annotator; Intermedia Data; EA. Source: Sreekumar Sukumaran and Ashish Sureka]

Knowledge-based Persistent Archives
[Figure labels, by level (columns: Ingest, Manage, Access):
* Knowledge: Relationships Between Concepts; Knowledge Repository for Rules; XTM DTD; Rules - KQL; Knowledge or Topic-Based Query
* Information: Attributes, Semantics; Information Repository; EMCAT / MIX; XML DTD (Topic Maps / Model-based Access); Attribute-based Query
* Data: Fields, Containers, Folders; Storage (Replicas, Persistent IDs); GRIDS; MCAT/HDF (Data Handling System - Storage Resource Broker); Feature-based Query]

NExIOM Ontology Models
[Figure labels: Electrical Power Analysis; Structure and Connectivity; Trade-Offs Analysis; Risk Modeling; Cost Modeling; Performance Modeling; Mapping; Ontology Authoring; TopSCAPE; COVE; Discipline Ontology Models; Translation Models; Interaction Logic; Application Logic; Semantic Interface; Semantic Application; WS (web service) connectors; NASA iLoC SBA Workspace; SI, IL, AL, BL; IDT DB; T1 RFx DB; T2]

Text Mining for Hypertext Creation
[Figure: a concept map breaks a general topic into Subtopic 1 … Subtopic i … Subtopic M, which link to Doc 1, Doc 2, …, Doc N to form a hypertext.]
Type of links: Term → Term links, Doc → Term links, Term → Doc links
[Figure: a general topic with Subtopic 1 … Subtopic i … Subtopic M ...
Doc 1, Doc 2, …, Doc N] and Doc → Doc links

Example from an Enterprise Architecture Process Ontology
Agent, Role, Process, Task, Measure, Goal

FEA-RMO delivers “Line of Sight”
[Figure labels: fea:Mission; fea:intentOf; fea:Agency; fea:hasIntent; fea:Customer; prm:GenericMeasurementIndicator; prm:PerformanceMeasure; prm:hasIndicator; prm:hasSpecialization; prm:OperationalizedMeasurementIndicator; brm:provides; brm:SubFunction; brm:hasProcess; brm:Process; brm:usesResource; brm:Resource; brm:hasPerformance; brm:realizedWith; brm:hasCustomer; srm:Service]

Medical record integration
[Figure: several overlapping scanned records from the Royal Marsden NHS Trust for one anonymised patient. Their text is interleaved in this extraction and cannot be cleanly separated. The recoverable documents are:
* Patient case note, 27 Aug 1998, seen in the Follow Up Staging Clinic
* Patient case note, 24 Jan 1997, seen in the Chemotherapy Clinic
* Patient case note, 15 Dec 1993, General Surgical
* Diagnostic CT report (radiology), examination of liver/thorax/abdomen/pelvis. Diagnosis: carcinoma of breast. Conclusion: no CT evidence of disease progression.
The recoverable history from the 1998 note: a 65-year-old woman was diagnosed with carcinoma of the left breast in 1974 and treated with a total mastectomy followed by chemotherapy. In 1982 a lump in the infraclavicular region was excised, followed by radiotherapy. In 1994 a tumour in the chest cavity was diagnosed with a CT-guided biopsy and treated with chemotherapy and radiotherapy to the mediastinum. She is now on Arimidex with stable disease apart from an enlarging lymph node in the right supraclavicular fossa, for which radiotherapy is to be considered, with review in the follow-up clinic in six weeks.]
Disease diagnosis
Consider a 62-year-old man with a 3-month history of severe back pain. His weight remained stable. CBC and routine biochemistry were normal. ESR was 52 mm/hour. An x-ray of the lumbar and thoracic spine was reported to show degenerative changes. Cancer?

Low back pain: features
[Decision-tree labels, in slide order: History and physical examination; Age > 50 years or failure of treatment or weight loss; History of previous cancer; ESR, spine films, 9% with cancer; No significant finding; ESR; ESR < 20 and only one clinical finding: no cancer; ESR > 20 or more than one clinical finding: X-ray; 2.3% cancer]

What was done… What happened… And why
[Graph residue. Nodes: Human:1382, Pain:5735, Ulcer:1945, Breast:1492, Clinic:4096, Clinic:1024, Clinic:2010, Biopsy:1066, Radio:1812, Chemo:6502, Mass:1666, Cancer:1914. Edge labels: locus, attends, reason, finding, plans, target, treats, time.]

Concept Lattice
Given the context (D1, T1) where D1 = {d1, d2, d3, d4} and T1 = {t1, t2, t3, t4, t5, t6}

Table: The input relation R = documents × keywords
R    t1  t2  t3  t4  t5  t6
d1   1   0   1   0   1   1
d2   1   0   1   0   1   1
d3   0   1   0   1   0   0
d4   1   0   0   1   0   1

Hasse Diagram (formal concepts):
C1: (D1, Ø)
C2: ({d1,d2,d4}, {t1,t6})
C3: ({d3,d4}, {t4})
C4: ({d1,d2}, {t1,t3,t5,t6})
C5: ({d4}, {t1,t4,t6})
C6: ({d3}, {t2,t4})
C7: (Ø, T1)
The formal concept C4 has two own terms {t3, t5} and two inherited terms {t1, t6}.

Text Analysis Spectrum
* Classification
* Concept Identification
* Targeted Facts and Events
* Entity Extraction
* Clustering
The spectrum runs from "What is this document about?" to "Who did what to whom, when, where, etc."

Why is getting dimensional data so hard?
Hank bought plastic explosives from Henry in Tucson yesterday.
Named Entity Extraction
Entity types: People, Weapons, Vehicles, Dates
NER Engine output: Hank, Henry, Plastic explosives, 11/01/07, Tucson

Automatic Pattern-Learning Systems
[Diagram labels: Language Input, Answers → Trainer → Model; Language Input → Decoder → Answers]
Pros:
* Portable across domains
* Tend to have broad coverage
* Robust in the face of degraded input
* Automatically find appropriate statistical patterns
* System knowledge not needed by those who supply the domain knowledge
Cons:
* Annotated training data, and lots of it, is needed
* Isn’t necessarily better or cheaper than a hand-built solution
Examples: Riloff et al., AutoSlog, Soderland WHISK (UMass); Mooney et al. Rapier (UTexas); Ciravegna (Sheffield). These learn lexico-syntactic patterns from templates.

Explicit Events, Object Identity, Symmetry
[Figure labels: E52 Time-Span (February 1945); E39 Actor; E53 Place (7012124); P82 at some time within; E7 Activity “Crimea Conference”; E38 Image; P86 falls within; E65 Creation Event; P81 ongoing throughout; E52 Time-Span (11-2-1945); E31 Document “Yalta Agreement”]

Rules Extraction
The formal concept C4 makes the following rules possible:
R1: t3 → t1, t6
R2: t5 → t1, t6
R3: t3 ↔ t5
Interpretation of R1 and R2: the use of term t3 or t5 is always associated with that of terms t1 and t6.
Rule R3 expresses the mutual equivalence of the terms {t3, t5}: all the documents that have the term t3 also have the term t5, and vice versa.
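The concept lattice and the rules read off from concept C4 can be checked mechanically. The sketch below assumes the relation R exactly as given in the Concept Lattice table. It enumerates formal concepts by brute force over document subsets, which is fine for four documents but does not scale, and the function names are hypothetical.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, Set, Tuple

# The input relation R (documents x keywords) from the Concept Lattice slide.
CONTEXT = {
    "d1": {"t1", "t3", "t5", "t6"},
    "d2": {"t1", "t3", "t5", "t6"},
    "d3": {"t2", "t4"},
    "d4": {"t1", "t4", "t6"},
}
ALL_TERMS: FrozenSet[str] = frozenset().union(*CONTEXT.values())


def common_terms(docs: Iterable[str]) -> FrozenSet[str]:
    """Intent: terms shared by every given document (all terms if none given)."""
    shared = set(ALL_TERMS)
    for doc in docs:
        shared &= CONTEXT[doc]
    return frozenset(shared)


def docs_with(terms: Iterable[str]) -> FrozenSet[str]:
    """Extent: documents that contain every given term."""
    wanted = frozenset(terms)
    return frozenset(d for d, ts in CONTEXT.items() if wanted <= ts)


def formal_concepts() -> Set[Tuple[FrozenSet[str], FrozenSet[str]]]:
    """All (extent, intent) pairs, found by closing every subset of documents."""
    found = set()
    docs = sorted(CONTEXT)
    for size in range(len(docs) + 1):
        for subset in combinations(docs, size):
            intent = common_terms(subset)
            found.add((docs_with(intent), intent))
    return found


def implies(premise: Iterable[str], conclusion: Iterable[str]) -> bool:
    """True if every document with all premise terms has all conclusion terms."""
    return docs_with(premise) <= docs_with(conclusion)


if __name__ == "__main__":
    for extent, intent in sorted(formal_concepts(), key=lambda c: -len(c[0])):
        print(sorted(extent), sorted(intent))
    print(implies({"t3"}, {"t1", "t6"}), implies({"t5"}, {"t1", "t6"}))
    print(implies({"t3"}, {"t5"}) and implies({"t5"}, {"t3"}))
```

The enumeration finds the seven concepts C1 to C7 of the Hasse diagram, and the three `implies` checks correspond to R1, R2, and R3. The converse does not hold: `implies({"t1"}, {"t3"})` is false because d4 has t1 but not t3.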
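Post-disaster reconstruction fund
Causal diagram: children left without guardians
[Figure labels: status of children left without guardians in each county and city; intervention by county and city governments, social affairs bureaus, etc.; establishment of welfare and trust funds in each county and city; subsidies for low- and middle-income households; rules on post-disaster reconstruction and on the use of funds for subsidies to single-parent families]

Chinese NER – Example 2
Input text (a Chinese news article on the 金馬獎 red carpet, kept in the original language because it is the NER input):
黑色當道 少了尖叫 女星太規矩 城城活跳跳
金馬獎星光大道不若前晚金鐘獎「峰芒」畢露,女星們規矩平穩的服裝,讓星光大道上少了一些特色,並未出現讓人眼睛一亮的驚喜。其中,在金鐘獎上讓人血脈僨張的蕭淑慎,在金馬獎上可以看出服裝「規矩」了些。總體來說,今年的星光大道造型略顯平庸。
秋冬主流黑色更在金馬星光大道上大量出現,凱渥模特兒公司老闆、也是專業資深時尚人洪偉明說:「可以發現他們選擇合適的服裝,規矩、正式的選擇,可避免遭受批評,今年確實少了些特色,但重要的國際場合,平穩的黑色服裝,也是出席正式場合的安全造型。」
洪偉明表示:「楊千嬅的服裝和她的人很搭,黑色蕾絲讓她不至於顏色過重,正式中又帶點活潑,感覺很棒。」台中市長胡志強女兒胡婷婷桃紅色的緞面禮服,也讓洪偉明很欣賞,他說:「整體感覺落落大方,亮色服裝和她的人也很適合,她的自信和星光大道主持人蔣怡的乾淨大方一樣,讓人感覺舒服,也是不錯的造型。」
舒淇鵝黃色的禮服,洪偉明笑說:「羅曼蒂克的感覺和她的笑容很搭配,讓氣色宛如戀愛中的女人一樣美好。」梁詠琪的黑色短禮服,雖然露出她的修長美腿,但洪偉明也建議:「她至少可以搭雙絲襪,整體感覺會更好。她在演唱會上展現性感,其實星光大道上也可以大膽改變。」
至於男星們的服裝,今年則是絲絨的天下,洪偉明笑說:「男星們服裝不易做出變化,敢大膽嘗試不同造型的人也不多,其中郭富城神采奕奕的精神,十分突出,張震的服裝則顯得穩重而規矩。」

Proper nouns (tag [Nb], proper name)
Word            Count
舒淇            2
張震            1
高達            1
賴雅妍          1
白              1
米蘭            1
竹幼婷戴榮賢    1
林熙蕾          2
郭富城          1
楊貴媚          1
范文芳          1
林志玲          1
金馬獎          3
楊采妮          1
舒淇鵝          1
藍正龍          1
金城武          2
侯佩岑          3
蕭淑慎          4
梁詠琪          2
黃志瑋          1
黃子佼          1
天心            1
楊千嬅          1
洪偉明          2
胡婷婷          2
師李            1
戴起            1

Place words (tag [Nc], place word)
Word    Count
背後    1
中途    1
世界    1
天下    1
原地    1

Time words (tag [Nd], time word)
Word    Count
昨天    4
新春    1
昨晚    1
早春    1
前晚    2
先後    1
今年    6
週末    1

Person-name class (tag [LN])
Word                              Count
露美腿                            2
台中市長胡志強女兒胡婷婷桃紅色    1

Generative / Discriminative
[Figure labels, from Generalize to Specify: Object: attribute (loan); Object: attribute (condition) (Provisional Statute for Earthquake Reconstruction); affected households; method (home rebuilding project); object (disaster-hit households); Object: attribute (financial institutions, interest); Object: attribute (housing); Object: condition (damaged)]

Example (consumer comments on a laundry detergent, translated, in slide order):
well suited to machine washing; pleasant fragrance; strong stain removal; washing takes less effort; fresh scent; removes 99 kinds of stains; washes especially clean; pleasant fragrance; gets white socks the cleanest; smells very nice; gentle on hands; removes stains well; clothes do not fade easily; washing is effortless; removes 99 kinds of stains; small amount needed; washes clean; little skin irritation; cleans all kinds of stains; washes clean; reasonable price; better washing results; nice smell; have always used this brand; washed clothes are whiter; pleasant smell; memorable advertising; washes clean; rinses out easily; not too harsh on hands; washes clean; small amount needed; washes clean; needs less than other brands; heavy advertising; washes clean; small amount needed; good quality; small amount needed; washes clean; good packaging; lots of advertising, attractive; pleasant fragrance; washes clean and white; good promotion, fun ads; many people say it is good

Semantic concept extraction for Malignancy DSS
[Figure labels: Patient (Patient ID), ESR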
Screening (Positive), Symptom (Positive Indication), Cancer]
Extraction stages shown in the figure, from information retrieval through information extraction to knowledge inference:
* Bag of “Words” extraction: malignancy, ESR, severe, back, pain, x-ray, lumbar, spine, degenerative, changes
* Expressions extraction: Patient ID, ESR, severe back pain, x-ray, lumbar spine, degenerative changes
* Named Entities extraction: Patient ID; diagnostic term: malignancy?; ESR: screening test; Lumbar, Spine: anatomy term; degenerative changes: symptom
* Events/Sentiment extraction
* Combined with structured data
* Decision Making, Treatment

(Document) data mining toward decision support
* Consolidate data of the same nature: Clustering
* Mine the data to produce association and dependency rules: Association Rules
* Visualize the results to help experts judge topics: Visualization
* Define processing guidelines so that decision support can be built: Processing Guideline

Development
[Figure labels: Local data, FTP, Gopher, HTML, More structure, Indexing, Search, Relevance Ranking, Latent Semantic Topology, Crawling, WebSQL, Social Network of Hyperlinks, WebL, XML, Clustering, Collaborative Filtering, Scatter/Gather, Topic Directories, Semi-supervised Learning, Automatic Classification, Web Communities, Web Servers, Topic Distillation, Focused Crawling, Monitor, Mine, Modify, User Profiling, Web Browsers]