Home » Server Options » Text & interMedia » Find Words in a String (11g Release 2 Windows)
Find Words in a String [message #618415] |
Fri, 11 July 2014 10:15 |
Duane
Messages: 559 Registered: December 2002
|
Senior Member |
|
|
Anyone have a query that can find WORDS in a String and break out those words?
User types in "George Washington" into an input field and that string is passed to an Oracle procedure where the words are split out.
It's also possible that a user could type in "George Washington DC" so it's not always limited to just 2 words.
Input_String = 'George Washington';
procedure XYZ (Input_String in varchar2,
Results out sys_refcursor)
begin
open Results for
select column1,
column2,
column3,
.
.
column20
from presidents_table
where first_name = WORD1 or
last_name = WORD2
end
|
|
|
Re: Find Words in a String [message #618416 is a reply to message #618415] |
Fri, 11 July 2014 10:22 |
|
Michel Cadot
Messages: 68675 Registered: March 2007 Location: Nanterre, France, http://...
|
Senior Member Account Moderator |
|
|
SQL> with
2 data as (
3 select 1 id, 'George Washington' data from dual
4 union all
5 select 2, 'George Washington DC' from dual
6 )
7 select id, to_number(column_value) word_nb,
8 regexp_substr(data, '[^ ]+', 1, column_value) word
9 from data,
10 table(cast(multiset(select level from dual connect by level<=regexp_count(data,' ')+1)
11 as sys.odcivarchar2list))
12 order by 1, 2
13 /
ID WORD_NB WORD
---------- ---------- ----------
1 1 George
1 2 Washington
2 1 George
2 2 Washington
2 3 DC
|
|
|
Re: Find Words in a String [message #618417 is a reply to message #618416] |
Fri, 11 July 2014 10:42 |
Lalit Kumar B
Messages: 3174 Registered: May 2013 Location: World Wide on the Web
|
Senior Member |
|
|
Seems like requirement is to finally search the president table with any of the passed names, first,or last etc... In that case, the WHERE clause in OP's query cannot be static. It will dynamically change.
@OP, can you confirm if that's the final requirement or Michel's solution suffices your requirement?
|
|
|
Re: Find Words in a String [message #618420 is a reply to message #618417] |
Fri, 11 July 2014 11:00 |
Duane
Messages: 559 Registered: December 2002
|
Senior Member |
|
|
I'll have to digest what Michel is doing but I believe what he has posted will work. I'm going to make the assumption that the user will start off with a last name and then move to a first name and then maybe something else. My query will just need to take those values and go against the correct columns when querying the table. I still need to think this completely through but I wanted to see if someone had a query to at least break the words up.
|
|
|
Re: Find Words in a String [message #618421 is a reply to message #618415] |
Fri, 11 July 2014 11:21 |
John Watson
Messages: 8944 Registered: January 2010 Location: Global Village
|
Senior Member |
|
|
Use a Context index?
orclz>
orclz> create index sh.ctxi on sh.customers(cust_street_address)
2 indextype is ctxsys.context;
Index created.
orclz>
orclz> select cust_street_address from customers
2 where contains(cust_street_address,'Oxford')>0 and rownum < 10;
CUST_STREET_ADDRESS
----------------------------------------
47 North Oxford Road
17 West Oxford Court
77 South Oxford Avenue
67 East Oxford Avenue
37 Oxford Street
97 West Oxford Avenue
117 Oxford Avenue
67 West Oxford Avenue
17 North Oxford Court
9 rows selected.
orclz>
|
|
|
Re: Find Words in a String [message #618426 is a reply to message #618421] |
Fri, 11 July 2014 11:33 |
Duane
Messages: 559 Registered: December 2002
|
Senior Member |
|
|
That's using Oracle Text, correct? I've been looking at that because what I'm tying to do is an autocomplete that is fast. Meaning, the user types in "Washington" and then maybe "Washington G" and I quickly bring back a result set that has those names. Same thing when searching for courses like "Dentistry Tee", "Dentistry Teeth" , "History George Washington" or just "Hist America". The search has to be fast and contain words that the user is typing in.
|
|
|
temp topic to be merged don't reply in it. [message #618513 is a reply to message #618415] |
Sat, 12 July 2014 10:47 |
Duane
Messages: 559 Registered: December 2002
|
Senior Member |
|
|
When using the CTXSYS.CONTEXT index does it matter how the words are arranged when getting results?
CREATE TABLE SEARCH
(
SRCH VARCHAR2(500 BYTE)
)
CREATE INDEX SEARCH_CTXI ON SEARCH
(SRCH)
INDEXTYPE IS CTXSYS.CONTEXT
NOPARALLEL;
SRCH Column
WS2000 LIFE-SCI 499G 1 11965 UNDERGRAD RSCH-CELL BIOL
WS2000 MEDICINE 340R 1 11991 HEMATOLOGY/ONCOLOGY I
WS2000 MEDICINE 471R 1 11981 CARD FELLOWSHIP I
MEDICINE WS2000 472R 1 11982 CARD FELLOWSHIP II
So, using a CONTAINS with 'MEDICINE' would result in 3 rows. 'WS2000' would pull up 4 rows. 'WS2000 MEDICINE' doesn't pull up any rows?
|
|
|
|
|
|
|
|
Re: Find Words in a String [message #618524 is a reply to message #618522] |
Sat, 12 July 2014 13:13 |
Lalit Kumar B
Messages: 3174 Registered: May 2013 Location: World Wide on the Web
|
Senior Member |
|
|
@Michel, I reviewed both the topics, and it seems that you have perfectly taken care of topic split and merge of comments. A complementing comment in each topic is left which makes sense to the previous comments in each topic respectively, so all is well.
Thank you Michel for the effort.
|
|
|
Re: Find Words in a String [message #618527 is a reply to message #618421] |
Sat, 12 July 2014 13:26 |
Solomon Yakobson
Messages: 3280 Registered: January 2010 Location: Connecticut, USA
|
Senior Member |
|
|
Just one note - by default context index is not updated along with table inserts:
SQL> select * from tbl;
W
------------------------------
ox ford
ox-ford
oxford
SQL> insert into tbl values('Oxford Town Hall');
1 row created.
SQL> select w from tbl where contains(w,'Oxford') > 0;
W
------------------------------
oxford
SQL> commit;
Commit complete.
SQL> select w from tbl where contains(w,'Oxford') > 0;
W
------------------------------
oxford
SQL> delete tbl where w = 'Oxford Town Hall';
1 row deleted.
SQL> commit;
Commit complete.
SQL> drop index tbl_idx1;
Index dropped.
SQL> create index tbl_idx1 on tbl(w) indextype is ctxsys.context parameters('transactional');
Index created.
SQL> select * from tbl;
W
------------------------------
ox ford
ox-ford
oxford
SQL> insert into tbl values('Oxford Town Hall');
1 row created.
SQL> select w from tbl where contains(w,'Oxford') > 0;
W
------------------------------
oxford
Oxford Town Hall
SQL>
SY.
|
|
|
Re: Find Words in a String [message #618529 is a reply to message #618527] |
Sat, 12 July 2014 13:58 |
Duane
Messages: 559 Registered: December 2002
|
Senior Member |
|
|
CREATE TABLE BLACKBOARD_COURSE_SEARCH
(
SRCH VARCHAR2(500 BYTE)
)
CREATE INDEX BK_COURSE_SEARCH_CTXI ON BLACKBOARD_COURSE_SEARCH
(SRCH)
INDEXTYPE IS CTXSYS.CONTEXT
NOPARALLEL;
SET DEFINE OFF;
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 LIFE-SCI 499G 1 11965 UNDERGRAD RSCH-CELL BIOL');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 MEDICINE 340R 1 11991 HEMATOLOGY/ONCOLOGY I');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 EDUC 575 1 11963 INTERNSHIP');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 FRN-LNG 899 1 11953 REQUIRED GRAD ENROLLMENT');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 OR-SURG 705 1 11989 CLIN ORAL SURGERY II');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 MEDICINE 375R 1 11990 RADIOLOGY II');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 MEDICINE 334R 1 11986 FAMILY MED/OFF PRAC I');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 MEDICINE 559R 1 11987 OB-GYN OFF PRAC I');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 OR-SURG 702 1 11988 PRIN ORAL SURGERY III');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 ACCORD 102 1 11976 FRESHMAN ACCORDION II');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 CIV-ENGR 274 1 11961 CVL ENGNRNG SYSTEMS DSGN');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 CIV-ENGR 300 1 11962 PROBLEMS');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 OR-SURG 720 1 11985 ORAL SURG HOSP RES I');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 FRENCH 290 1 11975 SP INTER FRENCH TOPS II');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 MEDICINE 350R 1 11960 NEPHROLOGY I');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 OR-SURG 701 1 11984 PRIN ORAL SURGERY II');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 ENGR 390 1 11973 ENGR COOP/INTERNSHIP');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 MC-ENGR 301A 1 11974 TPCS MECH & AEROSPC ENGR');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 TROMB 601 1 11958 GRAD TROMB-DOCTRL PERF');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 CELLO 100C 1 11994 APPLD STDY NON-MUSIC MAJ');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 MEDICINE 471R 1 11981 CARD FELLOWSHIP I');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 MEDICINE 472R 1 11982 CARD FELLOWSHIP II');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 MEDICINE 473R 1 11983 CARD FELLOWSHIP III');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 MEDICINE 452RR 1 11959 PED HEMATOLOGY/ONCOLOGY');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 EDUC-UL 522 1 11957 SCH ORG CUL CON CHANGE');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 PUB-ADM 581 1 11993 SEM URBAN ADMINISTRATION');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 MEDICINE 393R 1 11979 MEDICAL INTENSIVE CARE I');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 PHYSICS 490 1 11971 SPECIAL PROBLEMS');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 EL-ENGR 300K 1 11972 PROBLEMS IN ECE');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 MEDICINE 320R 1 11980 ALLERGY/IMMUNOLOGY I');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 THEATER 532A 1 11955 PROF COSTUME DESIGN');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 EL-ENGR 490 1 11956 RESEARCH');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 MEDICINE 115R 1 11978 MEDICAL TERMINOLOGY');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 SPANISH H211 1 11969 HONORS:SECOND YR SPAN I');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 PSYCH 450SA 1 11970 SPECIAL TOPICS');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 A&S 402 1 11968 SENIOR HONORS COLLOQUIUM');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 STR-BASS 102 1 12000 FRESHMAN STRING BASS II');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 EDUC 564 1 11997 IS: HIST HIGHER ED');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 MEDICINE 593R 1 11995 STUDIES ON CHEM DEPENDNC');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 BMS 413 1 11996 MEDICAL MICROBIOLOGY');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 NURSE 899 1 12002 REQUIRED GRAD ENROLLMENT');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 MEDICINE 442R 1 11998 PEDIATRIC OFF PRAC I');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 SAXOPH 202 1 11999 SOPHOMORE SAXOPHONE II');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 GEOG 503 1 12004 HISTRY & PHILSY GEOGRPHY');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 MEDICINE 601 1 12003 INT MED/DOC INST YR 6');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 CJC 599 1 12001 RESEARCH AND THESIS');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 LIFE-SCI 497L 1 11966 SPECIAL TPCS-BIOL SCIENC');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 A&S 400G 1 11967 SPECIAL READINGS/TOPICS');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 MEDICINE 332R 1 11964 ENDOCRINOLOGY I');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 RELIG-ST 697RS 1 11954 DOCTRL LVL INDEP READING');
Insert into BLACKBOARD_COURSE_SEARCH
(SRCH)
Values
('WS2000 MEDICINE 360R 1 11992 PATHOLOGY/SURGICAL I');
COMMIT;
select *
from blackboard_course_search
where contains(srch, SrchWords) > 0 and
rownum < 50
If I try 'WS2000 11968' then I don't get any results.
SrchWords = 'WS2000 11968';
Can you get it to work?
|
|
|
Re: Find Words in a String [message #618535 is a reply to message #618529] |
Sat, 12 July 2014 14:30 |
Solomon Yakobson
Messages: 3280 Registered: January 2010 Location: Connecticut, USA
|
Senior Member |
|
|
1. You created non-transactional context index before inserting rows. So index is empty. You need to create index as:
CREATE INDEX BK_COURSE_SEARCH_CTXI ON BLACKBOARD_COURSE_SEARCH
(SRCH)
INDEXTYPE IS CTXSYS.CONTEXT
NOPARALLEL
PARAMETERS('TRANSACTIONAL');
2. SrchWords = 'WS2000 11968' is two words, so you need to tell CONTAINS are you looking for both or any of them. Both of them:
select *
from blackboard_course_search
where contains(srch,'WS2000 & 11968') > 0
/
SRCH
-------------------------------------------------
WS2000 A&S 402 1 11968 SENIOR HONORS COLLOQUIUM
SQL>
Any of them:
select *
from blackboard_course_search
where contains(srch,'WS2000 | 11968') > 0
/
SY.
|
|
|
|
Re: Find Words in a String [message #618545 is a reply to message #618535] |
Sat, 12 July 2014 16:04 |
Duane
Messages: 559 Registered: December 2002
|
Senior Member |
|
|
I believe I already know the answer but I thought I would confirm.
You must do 'WS2000 & 11968' or 'WS2000 | 11968' between the words and NOT JUST 'WS2000 11968'? Is there any way around not doing "&" or "|" and getting the same results?
|
|
|
Re: Find Words in a String [message #618547 is a reply to message #618545] |
Sat, 12 July 2014 16:52 |
Solomon Yakobson
Messages: 3280 Registered: January 2010 Location: Connecticut, USA
|
Senior Member |
|
|
No, but assuming SrchWords is one or more spcae separated words:
select *
from blackboard_course_search
where contains(srch,replace(SrchWords,' ','&')) > 0
/
For example:
SQL> select *
2 from blackboard_course_search
3 where contains(srch,replace('WS2000 11968',' ','&')) > 0
4 /
SRCH
------------------------------------------------------------------
WS2000 A&S 402 1 11968 SENIOR HONORS COLLOQUIUM
SQL>
SY.
|
|
|
|
Re: Find Words in a String [message #618549 is a reply to message #618426] |
Sat, 12 July 2014 23:40 |
|
Barbara Boehmer
Messages: 9097 Registered: November 2002 Location: California, USA
|
Senior Member |
|
|
Duane wrote on Fri, 11 July 2014 09:33That's using Oracle Text, correct? I've been looking at that because what I'm tying to do is an autocomplete that is fast. Meaning, the user types in "Washington" and then maybe "Washington G" and I quickly bring back a result set that has those names. Same thing when searching for courses like "Dentistry Tee", "Dentistry Teeth" , "History George Washington" or just "Hist America". The search has to be fast and contain words that the user is typing in.
Oracle Text has a lot of options. The following example demonstrates a few of them. I have used a bind variable for the search terms for maximum efficiency. I have then modified the search terms within the query to allow for similarity in spelling and ending wildcard for each term. I have also used the accum operator to affect the scoring. I have then used that text scoring and edit_distance_similarity to order the results. It is important to note that you must order the results within an inner sub-query, then select the number of rows from an outer sub-query; Otherwise, you just get the first 50 random rows that match the criteria, not the closest matches in order. I have included columns for ranking, text score, and edit_distance_similarity, so that you can see how the scoring affects the order, but you could use any one or combination of them that you like. The accum operator makes it so that the more separate terms that match, the higher the score. The following demonstrates the type of searches that a user might enter, gradually refining the search, changing the order of the results, similar to the auto complete that you have asked for. This is just a small sampling of some of the features available in Oracle Text.
SCOTT@orcl12c> VARIABLE SrchWords VARCHAR2(100)
SCOTT@orcl12c> EXEC :SrchWords := 'Washington'
PL/SQL procedure successfully completed.
SCOTT@orcl12c> SELECT *
2 FROM (SELECT DENSE_RANK () OVER
3 (ORDER BY SCORE(1) DESC,
4 UTL_MATCH.EDIT_DISTANCE_SIMILARITY (srch, :SrchWords) DESC)
5 AS ranking,
6 SCORE(1) text_score,
7 UTL_MATCH.EDIT_DISTANCE_SIMILARITY (srch, :SrchWords) eds,
8 srch
9 FROM blackboard_course_search
10 WHERE CONTAINS
11 (srch,
12 '?' || REPLACE (:SrchWords, ' ', '% ACCUM ?') || '%',
13 1) > 0
14 ORDER BY ranking)
15 WHERE ROWNUM < 50
16 /
RANKING TEXT_SCORE EDS SRCH
---------- ---------- ---------- -------------------------------------------
1 3 100 Washington
2 3 59 George Washington
3 3 56 General Washington
4 3 50 Booker T. Washington
5 3 40 General George Washington
6 3 38 George Washington's history
7 3 32 The History of George Washington
8 3 24 The History of George Washington in America
8 rows selected.
SCOTT@orcl12c> EXEC :Srchwords := 'Washington G'
PL/SQL procedure successfully completed.
SCOTT@orcl12c> /
RANKING TEXT_SCORE EDS SRCH
---------- ---------- ---------- -------------------------------------------
1 52 48 George Washington
2 52 45 General Washington
3 52 41 George Washington's history
4 52 32 General George Washington
5 52 26 The History of George Washington in America
6 52 25 The History of George Washington
7 2 84 Washington
8 2 40 Booker T. Washington
8 rows selected.
SCOTT@orcl12c> EXEC :Srchwords := 'George Washington'
PL/SQL procedure successfully completed.
SCOTT@orcl12c> /
RANKING TEXT_SCORE EDS SRCH
---------- ---------- ---------- -------------------------------------------
1 52 100 George Washington
2 52 68 General George Washington
3 52 63 George Washington's history
4 52 54 The History of George Washington
5 52 40 The History of George Washington in America
6 2 78 General Washington
7 2 65 Booker T. Washington
8 2 59 Washington
8 rows selected.
SCOTT@orcl12c> EXEC :Srchwords := 'History George Washington'
PL/SQL procedure successfully completed.
SCOTT@orcl12c> /
RANKING TEXT_SCORE EDS SRCH
---------- ---------- ---------- -------------------------------------------
1 68 79 The History of George Washington
2 68 59 The History of George Washington in America
3 68 34 George Washington's history
4 34 72 General George Washington
5 34 68 George Washington
6 1 52 General Washington
6 1 52 Booker T. Washington
7 1 40 Washington
8 1 28 The History of America
9 1 16 American History
10 rows selected.
SCOTT@orcl12c> EXEC :Srchwords := 'History George Washington America'
PL/SQL procedure successfully completed.
SCOTT@orcl12c> /
RANKING TEXT_SCORE EDS SRCH
---------- ---------- ---------- -------------------------------------------
1 76 77 The History of George Washington in America
2 51 55 The History of George Washington
3 51 49 George Washington's history
4 26 55 General George Washington
5 26 52 George Washington
6 26 40 The History of America
7 26 19 American History
8 1 40 Booker T. Washington
8 1 40 General Washington
9 1 31 Washington
10 rows selected.
|
|
|
|
Re: Find Words in a String [message #619353 is a reply to message #618549] |
Mon, 21 July 2014 16:12 |
Duane
Messages: 559 Registered: December 2002
|
Senior Member |
|
|
What are the question marks doing? I'm just trying to understand and learn.
CONTAINS
(srch,
'?' || REPLACE (:SrchWords, ' ', '% ACCUM ?') || '%',
1) > 0
|
|
|
|
Re: Find Words in a String [message #619750 is a reply to message #618549] |
Thu, 24 July 2014 16:12 |
Duane
Messages: 559 Registered: December 2002
|
Senior Member |
|
|
Barbara,
Do you have any suggestions or recommendations on how I might handle a situation where a user might want to search on a course and instructor? What I have now is they select a "by Course Info" or "by Instructor" button and they are either searching the "course_search" or "instructor_search" column using the query you provided. That works great but I know what's coming and I'm sure they will ask for a way to search by course and instructor.
With that in mind. Would it be feasible to include everything in one column? (e.g. SP2008 TUBA 402 0001 16953 SENIOR TUBA II INSTRUCTOR1 INSTRUCTOR2 INSTRUCTOR3... SP2008 TRUMPET 101 0001 16923 FRESHMAN TRUMPET I INSTRUCTOR1) Or is there a better way to keep the two columns, "course_search" and "instructor_search", and somehow break up the entered text and go against the correct column?
Words = "SS2007 MATH CROSS"
Course = "SS2007 MATH"
Instructor = "CROSS"
Like:
where contains(course_search, '%'||replace(Words, ' ', '% ACCUM ?')||'%', 1) > 0 or
contains(instructor_search, '%'||replace(Words, ' ', '% ACCUM ?')||'%', 1) > 0
course_search instructor_search
SP2008 COMM-ST 312 0005 16565 ADVANCED PUBLIC SPEAKING WASHINGTON GEORGE
SP2008 COMM-ST 314WI 0001 16568 REPORTING BUSH GEORGE
SP2008 COMM-ST 317 0001 16569 PERSUASION POE R
|
|
|
Re: Find Words in a String [message #619751 is a reply to message #619750] |
Thu, 24 July 2014 17:18 |
|
Barbara Boehmer
Messages: 9097 Registered: November 2002 Location: California, USA
|
Senior Member |
|
|
I would maintain two separate columns in the table and two separate search options and allow them to use either one or both. I would use a multi_column_datastore and section group with sections to enable searching within either one or both sections using only one text domain index hit for maxiimum efficiency. Please see the demonstration below.
-- table:
SCOTT@orcl12c> CREATE TABLE blackboard_course_search
2 (course_search VARCHAR2(60),
3 instructor_search VARCHAR2(20),
4 srch VARCHAR2(1))
5 /
Table created.
-- multi_column_datastore:
SCOTT@orcl12c> BEGIN
2 CTX_DDL.CREATE_PREFERENCE ('test_ds', 'MULTI_COLUMN_DATASTORE');
3 CTX_DDL.SET_ATTRIBUTE ('test_ds', 'COLUMNS', 'course_search, instructor_search');
4 END;
5 /
PL/SQL procedure successfully completed.
-- transactional index using multi_column_datastore and auto_section_group:
SCOTT@orcl12c> CREATE INDEX bk_course_search_ctxi
2 ON blackboard_course_search (srch)
3 INDEXTYPE IS CTXSYS.CONTEXT
4 PARAMETERS
5 ('TRANSACTIONAL
6 DATASTORE test_ds
7 SECTION GROUP CTXSYS.AUTO_SECTION_GROUP')
8 /
Index created.
-- data:
SCOTT@orcl12c> SET DEFINE OFF
SCOTT@orcl12c> INSERT ALL
2 INTO blackboard_course_search VALUES
3 ('SP2008 COMM-ST 312 0005 16565 ADVANCED PUBLIC SPEAKING', 'WASHINGTON GEORGE', NULL)
4 INTO blackboard_course_search VALUES
5 ('SP2008 COMM-ST 314WI 0001 16568 REPORTING', 'BUSH GEORGE', NULL)
6 INTO blackboard_course_search VALUES
7 ('SP2008 COMM-ST 317 0001 16569 PERSUASION', 'POE R', NULL)
8 INTO blackboard_course_search VALUES
9 ('SS2007 MATH', 'CROSS', NULL)
10 INTO blackboard_course_search VALUES
11 ('SS2007 MATH', 'SOMEBODY', NULL)
12 INTO blackboard_course_search VALUES
13 ('WHATEVER', 'CROSS', NULL)
14 SELECT * FROM DUAL
15 /
6 rows created.
SCOTT@orcl12c> COMMIT
2 /
Commit complete.
-- variables for search:
SCOTT@orcl12c> VARIABLE by_course_info VARCHAR2(100)
SCOTT@orcl12c> VARIABLE by_instructor VARCHAR2(100)
-- search by both course info and instructor info:
SCOTT@orcl12c> EXEC :by_course_info := 'SS2007 MATH'
PL/SQL procedure successfully completed.
SCOTT@orcl12c> EXEC :by_instructor := 'CROSS'
PL/SQL procedure successfully completed.
SCOTT@orcl12c> COLUMN course_search FORMAT A20
SCOTT@orcl12c> COLUMN instructor_search FORMAT A20
SCOTT@orcl12c> SET AUTOTRACE ON EXPLAIN
SCOTT@orcl12c> SELECT *
2 FROM (SELECT DENSE_RANK () OVER
3 (ORDER BY SCORE(1) DESC,
4 UTL_MATCH.EDIT_DISTANCE_SIMILARITY
5 (course_search, :by_course_info) DESC,
6 UTL_MATCH.EDIT_DISTANCE_SIMILARITY
7 (instructor_search, :by_instructor) DESC)
8 AS ranking,
9 SCORE(1) text_score,
10 UTL_MATCH.EDIT_DISTANCE_SIMILARITY (course_search, :by_course_info) csim,
11 UTL_MATCH.EDIT_DISTANCE_SIMILARITY (instructor_search, :by_instructor) isim,
12 course_search,
13 instructor_search
14 FROM blackboard_course_search
15 WHERE CONTAINS
16 (srch,
17 RTRIM
18 (NVL2
19 (:by_course_info,
20 '?' || REPLACE (:by_course_info, ' ', '% WITHIN course_search ACCUM ?')
21 || '% within course_search ACCUM ',
22 NULL) ||
23 NVL2
24 (:by_instructor,
25 '?' || REPLACE (:by_instructor, ' ', '% WTHIN instructor_search ACCUM ?')
26 || '% WITHIN instructor_search',
27 NULL),
28 ' ACCUM '),
29 1) > 0
30 ORDER BY ranking)
31 WHERE ROWNUM < 50
32 /
RANKING TEXT_SCORE CSIM ISIM COURSE_SEARCH INSTRUCTOR_SEARCH
---------- ---------- ---------- ---------- -------------------- --------------------
1 68 100 100 SS2007 MATH CROSS
2 34 100 13 SS2007 MATH SOMEBODY
3 1 0 100 WHATEVER CROSS
3 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1244607283
-----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 96 | 4 (0)| 00:00:01 |
|* 1 | COUNT STOPKEY | | | | | |
| 2 | VIEW | | 1 | 96 | 4 (0)| 00:00:01 |
| 3 | WINDOW SORT | | 1 | 58 | 4 (0)| 00:00:01 |
| 4 | TABLE ACCESS BY INDEX ROWID| BLACKBOARD_COURSE_SEARCH | 1 | 58 | 4 (0)| 00:00:01 |
|* 5 | DOMAIN INDEX | BK_COURSE_SEARCH_CTXI | | | 4 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<50)
5 - access("CTXSYS"."CONTAINS"("SRCH",RTRIM(NVL2(:BY_COURSE_INFO,'?'||REPLACE(:BY_COURSE_INFO,'
','% WITHIN course_search ACCUM ?')||'% within course_search ACCUM
',NULL)||NVL2(:BY_INSTRUCTOR,'?'||REPLACE(:BY_INSTRUCTOR,' ','% WTHIN instructor_search ACCUM
?')||'% WITHIN instructor_search',NULL),' ACCUM '),1)>0)
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
SCOTT@orcl12c> SET AUTOTRACE OFF
-- search by course info only;
SCOTT@orcl12c> EXEC :by_course_info := 'SS2007 MATH'
PL/SQL procedure successfully completed.
SCOTT@orcl12c> EXEC :by_instructor := NULL
PL/SQL procedure successfully completed.
SCOTT@orcl12c> SET AUTOTRACE ON EXPLAIN
SCOTT@orcl12c> /
RANKING TEXT_SCORE CSIM ISIM COURSE_SEARCH INSTRUCTOR_SEARCH
---------- ---------- ---------- ---------- -------------------- --------------------
1 52 100 0 SS2007 MATH CROSS
1 52 100 0 SS2007 MATH SOMEBODY
2 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1244607283
-----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 96 | 4 (0)| 00:00:01 |
|* 1 | COUNT STOPKEY | | | | | |
| 2 | VIEW | | 1 | 96 | 4 (0)| 00:00:01 |
| 3 | WINDOW SORT | | 1 | 58 | 4 (0)| 00:00:01 |
| 4 | TABLE ACCESS BY INDEX ROWID| BLACKBOARD_COURSE_SEARCH | 1 | 58 | 4 (0)| 00:00:01 |
|* 5 | DOMAIN INDEX | BK_COURSE_SEARCH_CTXI | | | 4 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<50)
5 - access("CTXSYS"."CONTAINS"("SRCH",RTRIM(NVL2(:BY_COURSE_INFO,'?'||REPLACE(:BY_COURSE_INFO,'
','% WITHIN course_search ACCUM ?')||'% within course_search ACCUM
',NULL)||NVL2(:BY_INSTRUCTOR,'?'||REPLACE(:BY_INSTRUCTOR,' ','% WTHIN instructor_search ACCUM
?')||'% WITHIN instructor_search',NULL),' ACCUM '),1)>0)
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
SCOTT@orcl12c> SET AUTOTRACE OFF
-- search by instructor only:
SCOTT@orcl12c> EXEC :by_course_info := NULL
PL/SQL procedure successfully completed.
SCOTT@orcl12c> EXEC :by_instructor := 'CROSS'
PL/SQL procedure successfully completed.
SCOTT@orcl12c> SET AUTOTRACE ON EXPLAIN
SCOTT@orcl12c> /
RANKING TEXT_SCORE CSIM ISIM COURSE_SEARCH INSTRUCTOR_SEARCH
---------- ---------- ---------- ---------- -------------------- --------------------
1 3 0 100 SS2007 MATH CROSS
1 3 0 100 WHATEVER CROSS
2 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1244607283
-----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 96 | 4 (0)| 00:00:01 |
|* 1 | COUNT STOPKEY | | | | | |
| 2 | VIEW | | 1 | 96 | 4 (0)| 00:00:01 |
| 3 | WINDOW SORT | | 1 | 58 | 4 (0)| 00:00:01 |
| 4 | TABLE ACCESS BY INDEX ROWID| BLACKBOARD_COURSE_SEARCH | 1 | 58 | 4 (0)| 00:00:01 |
|* 5 | DOMAIN INDEX | BK_COURSE_SEARCH_CTXI | | | 4 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<50)
5 - access("CTXSYS"."CONTAINS"("SRCH",RTRIM(NVL2(:BY_COURSE_INFO,'?'||REPLACE(:BY_COURSE_INFO,'
','% WITHIN course_search ACCUM ?')||'% within course_search ACCUM
',NULL)||NVL2(:BY_INSTRUCTOR,'?'||REPLACE(:BY_INSTRUCTOR,' ','% WTHIN instructor_search ACCUM
?')||'% WITHIN instructor_search',NULL),' ACCUM '),1)>0)
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
SCOTT@orcl12c>
|
|
|
|
Re: Find Words in a String [message #619856 is a reply to message #619751] |
Fri, 25 July 2014 09:43 |
Duane
Messages: 559 Registered: December 2002
|
Senior Member |
|
|
Hi Barbara,
I misled you some on what would be entered. Right now, they can search by course or instructor by clicking a button for each option but I'm not sure how I would break up the text for each variable if "SS2007 MATH CROSS" was entered.
My example I gave you broke out the course and instructor (see below). I did this to show you what was the course information and what was the instructor information but the actual information may be something like "course information instructor name", "course information", "instructor information". So, basically, they want the ability to search 3 different ways and obtain the correct results.
I'm guessing I could still use your queries but maybe replace "by_course_info" and "by_instructor" with just "Words". I'll let you comment on whether that will work or not.
Example entry with parsed values from my post
Words = "SS2007 MATH CROSS"
Course = "SS2007 MATH"
Instructor = "CROSS"
__________________________________
How users might enter data
Words = "SS2007 MATH CROSS"
Words = "GEORGE WASHINGTON"
Words = "315 MATH WASHINGTON"
Words = "HEMATOLOGY/ONCOLOGY"
Words = "SS2007 HEMATOLOGY/ONCOLOGY BUSH G"
I hope I'm not confusing you more with this post. If you need additional information, let me know.
|
|
|
Re: Find Words in a String [message #619872 is a reply to message #619856] |
Fri, 25 July 2014 11:35 |
|
Barbara Boehmer
Messages: 9097 Registered: November 2002 Location: California, USA
|
Senior Member |
|
|
In that case, I would use the multi_column_datastore, but without sections, as shown below.
SCOTT@orcl12c> CREATE TABLE blackboard_course_search
2 (course_search VARCHAR2(60),
3 instructor_search VARCHAR2(20),
4 srch VARCHAR2(1))
5 /
Table created.
SCOTT@orcl12c> BEGIN
2 CTX_DDL.CREATE_PREFERENCE ('test_ds', 'MULTI_COLUMN_DATASTORE');
3 CTX_DDL.SET_ATTRIBUTE ('test_ds', 'COLUMNS', 'course_search, instructor_search');
4 END;
5 /
PL/SQL procedure successfully completed.
SCOTT@orcl12c> CREATE INDEX bk_course_search_ctxi
2 ON blackboard_course_search (srch)
3 INDEXTYPE IS CTXSYS.CONTEXT
4 PARAMETERS
5 ('TRANSACTIONAL
6 DATASTORE test_ds')
7 /
Index created.
SCOTT@orcl12c> SET DEFINE OFF
SCOTT@orcl12c> INSERT ALL
2 INTO blackboard_course_search VALUES
3 ('SP2008 COMM-ST 312 0005 16565 ADVANCED PUBLIC SPEAKING', 'WASHINGTON GEORGE', NULL)
4 INTO blackboard_course_search VALUES
5 ('SP2008 COMM-ST 314WI 0001 16568 REPORTING', 'BUSH GEORGE', NULL)
6 INTO blackboard_course_search VALUES
7 ('SP2008 COMM-ST 317 0001 16569 PERSUASION', 'POE R', NULL)
8 INTO blackboard_course_search VALUES
9 ('SS2007 MATH', 'CROSS', NULL)
10 INTO blackboard_course_search VALUES
11 ('SS2007 MATH', 'SOMEBODY', NULL)
12 INTO blackboard_course_search VALUES
13 ('WHATEVER', 'CROSS', NULL)
14 SELECT * FROM DUAL
15 /
6 rows created.
SCOTT@orcl12c> COMMIT
2 /
Commit complete.
SCOTT@orcl12c> VARIABLE words VARCHAR2(100)
SCOTT@orcl12c> COLUMN course_search FORMAT A60 WORD_WRAPPED
SCOTT@orcl12c> COLUMN instructor_search FORMAT A20 WORD_WRAPPED
SCOTT@orcl12c> EXEC :words := 'SS2007 MATH CROSS'
PL/SQL procedure successfully completed.
SCOTT@orcl12c> SELECT *
2 FROM (SELECT DENSE_RANK () OVER
3 (ORDER BY SCORE(1) DESC,
4 GREATEST
5 (UTL_MATCH.EDIT_DISTANCE_SIMILARITY
6 (course_search, :words),
7 UTL_MATCH.EDIT_DISTANCE_SIMILARITY
8 (instructor_search, :words),
9 UTL_MATCH.EDIT_DISTANCE_SIMILARITY
10 (course_search || ' ' || instructor_search, :words),
11 UTL_MATCH.EDIT_DISTANCE_SIMILARITY
12 (instructor_search || ' ' || course_search, :words)) DESC)
13 AS ranking,
14 course_search,
15 instructor_search
16 FROM blackboard_course_search
17 WHERE CONTAINS
18 (srch,
19 '?' || REPLACE (:words, ' ', '% ACCUM ?') || '%',
20 1) > 0
21 ORDER BY ranking)
22 WHERE ROWNUM < 50
23 /
RANKING COURSE_SEARCH INSTRUCTOR_SEARCH
---------- ------------------------------------------------------------ --------------------
1 SS2007 MATH CROSS
2 SS2007 MATH SOMEBODY
3 WHATEVER CROSS
3 rows selected.
SCOTT@orcl12c> EXEC :words := 'GEORGE WASHNGTON'
PL/SQL procedure successfully completed.
SCOTT@orcl12c> /
RANKING COURSE_SEARCH INSTRUCTOR_SEARCH
---------- ------------------------------------------------------------ --------------------
1 SP2008 COMM-ST 312 0005 16565 ADVANCED PUBLIC SPEAKING WASHINGTON GEORGE
2 SP2008 COMM-ST 314WI 0001 16568 REPORTING BUSH GEORGE
2 rows selected.
SCOTT@orcl12c> EXEC :words := '315 MATH WASHINGTON'
PL/SQL procedure successfully completed.
SCOTT@orcl12c> /
RANKING COURSE_SEARCH INSTRUCTOR_SEARCH
---------- ------------------------------------------------------------ --------------------
1 SS2007 MATH SOMEBODY
2 SP2008 COMM-ST 312 0005 16565 ADVANCED PUBLIC SPEAKING WASHINGTON GEORGE
3 SS2007 MATH CROSS
3 rows selected.
SCOTT@orcl12c> EXEC :words := 'HEMATOLOGY/ONCOLOGY'
PL/SQL procedure successfully completed.
SCOTT@orcl12c> /
no rows selected
SCOTT@orcl12c> EXEC :words := 'SS2007 HEMATOOGY/ONCOLOGY BUSH G'
PL/SQL procedure successfully completed.
SCOTT@orcl12c> /
RANKING COURSE_SEARCH INSTRUCTOR_SEARCH
---------- ------------------------------------------------------------ --------------------
1 SP2008 COMM-ST 314WI 0001 16568 REPORTING BUSH GEORGE
2 SS2007 MATH CROSS
2 SS2007 MATH SOMEBODY
3 SP2008 COMM-ST 312 0005 16565 ADVANCED PUBLIC SPEAKING WASHINGTON GEORGE
4 rows selected.
|
|
|
Re: Find Words in a String [message #619878 is a reply to message #619872] |
Fri, 25 July 2014 12:08 |
Duane
Messages: 559 Registered: December 2002
|
Senior Member |
|
|
Hi Barbara,
This is wonderful. Thank you.
Quick question. The index is being applied to the "srch" column?
When I was reading the Oracle Text docs I came about the multi-column index but didn't know how I could make it work with what you had already shown me (i.e. using CONTAINS (course_search.....)). So "srch" clears up that problem.
I was also applying this to my "course_search" (renamed the column from my very first post on this subject) column. I was just trying to apply techniques from what I was reading about Oracle Text.
Would a WORDLIST help any in this case?
begin
ctx_ddl.create_preference ('text_search', 'BASIC_WORDLIST');
ctx_ddl.set_attribute ('text_search', 'substring_index', 'YES');
ctx_ddl.set_attribute ('text_search', 'prefix_index', 'YES');
ctx_ddl.set_attribute ('text_search', 'prefix_min_length', 1);
ctx_ddl.set_attribute ('text_search', 'prefix_max_length', 10);
end;
create index bk_course_search_ctxi on blackboard_course_search(course_search)
indextype is ctxsys.context
parameters('TRANSACTIONAL WORDLIST text_search')
noparallel;
I believe after this I won't bother you anymore.
|
|
|
Re: Find Words in a String [message #619897 is a reply to message #619878] |
Fri, 25 July 2014 15:20 |
|
Barbara Boehmer
Messages: 9097 Registered: November 2002 Location: California, USA
|
Senior Member |
|
|
Quote:
...Quick question. The index is being applied to the "srch" column?...
The index can be created on any single text column. What data is indexed depends on what columns are in the multi_column_datastore that is used in the index parameters. Instead of creating the index on any one of those columns, which you could do, it is commonplace to use a separate dummy column, perhaps with a name with a broader meaning, instead of one specific column. Whatever column you create the index on is the one that you will need to search on, remembering that it is actually searching the columns in the multi_column_datastore.
Quote:
Would a WORDLIST help any in this case?
...
begin
ctx_ddl.create_preference ('text_search', 'BASIC_WORDLIST');
ctx_ddl.set_attribute ('text_search', 'substring_index', 'YES');
ctx_ddl.set_attribute ('text_search', 'prefix_index', 'YES');
ctx_ddl.set_attribute ('text_search', 'prefix_min_length', 1);
ctx_ddl.set_attribute ('text_search', 'prefix_max_length', 10);
end;
...
...
The purpose for a word list like the one you posted is for speeding up searches that use leading wildcards. If you are expecting wildcards, then you should also set your wildcard_maxterms appropriately.
|
|
|
Re: Find Words in a String [message #619901 is a reply to message #619897] |
Fri, 25 July 2014 15:59 |
Duane
Messages: 559 Registered: December 2002
|
Senior Member |
|
|
Gotcha!
That wasn't very clear in the docs or I just missed that part when I was reading. I'm also trying to work on the application and read the docs while working.
Ah, you're talking about something like "wom?n", "edit* content" or work% within the actual user's text and not the "?" or "%" within the CONTAINS statement. I was thinking you had to do that if the CONTAINS had those characters. Silly me.
CONTAINS (srch, '?' || REPLACE (:words, ' ', '% ACCUM ?') || '%', 1) > 0
Thank you for all these queries you have given me. All this helps and will help me build on these concepts.
|
|
|
Re: Find Words in a String [message #619909 is a reply to message #619901] |
Fri, 25 July 2014 19:46 |
|
Barbara Boehmer
Messages: 9097 Registered: November 2002 Location: California, USA
|
Senior Member |
|
|
The only valid wildcard in contains clause using a context index is %. You can have a leading wildcard, like %word or a trailing wildcard, like word% or both, like %word%. Whether you put wildcards in the words variable or in the contains clause they end up in the contains clause. If you have a contains clause like:
CONTAINS (srch, '?' || REPLACE (:words, ' ', '% ACCUM ?') || '%', 1) > 0
and the user enters the value word1 word2 word3 for the :words variable, then you end up with:
CONTAINS (srch, ?word1% ACCUM ?word2% ACCUM ?word3%', 1) > 0
It automatically adds searching for similarly spelled words and trailing wildcards (like auto complete) to every word and adds ACCUM between every word, so that it returns any rows with any of the terms, with the rows that have more of the individual terms scoring higher. It is up to you whether you want to do this automatically for your users or allow your users to enter such things on their own. It sounded like you wanted to make it a simple, google-like entry for your users and do the rest automatically, so that is why I suggested that.
If you are expecting that users may enter ? and * and such in invalid places in the words variable, then you will want to strip those out, perhaps nesting an additional replace:
CONTAINS (srch, '?' || REPLACE (REPLACE (REPLACE (:words, '?', ''), '*', ''), ' ', '% ACCUM ?') || '%', 1) > 0
|
|
|
Goto Forum:
Current Time: Sat Sep 07 19:38:13 CDT 2024
|