我在Oracle中有一个包含四列的表。
现在,用户可以向我的查询输入输入字符串作为“向右膝盖操作”(有效),并且我的查询应返回与DiagnosisName列中的大多数单词匹配的ICD代码(IKR123)。
以下是我当前的查询。(未提供正确的输出)
SELECT diagnosisname
FROM
(SELECT diagnosisname,
UTL_MATCH.jaro_winkler_similarity('%operation Knee right%',diagnosisname)
FROM icd_code
ORDER BY UTL_MATCH.EDIT_DISTANCE_SIMILARITY('%operation Knee right%',diagnosisname) DESC
)
WHERE ROWNUM<2;
该查询给我的输出为“左膝操作”,但我的期望是“右膝操作”。
关于您使用UTL_MATCH的注意事项:
ORDER BY UTL_MATCH.EDIT_DISTANCE_SIMILARITY('%operation Knee right%',diagnosisname)DESC
这不会给您正确的结果。因为,您仅考虑可能的相似性,但是没有考虑数据输入错误。因此,您必须使用JARO_WINKLER_SIMILARITY。
右膝手术
您需要牢记输入的CASE和要比较的列值。为了正确匹配,它们必须处于类似情况。您正在LOWERCASE中传递输入,但是,列值在INITCAP中。最好将列值和输入都转换为相似的情况。
让我们看下面的演示以了解:
SQL> WITH DATA AS(
2 SELECT 'Heart Operation' diagnosis_name, 'IH123' icd_code FROM dual UNION ALL
3 SELECT 'Knee Operation' diagnosis_name, 'IK123' icd_code FROM dual UNION ALL
4 SELECT 'Left Knee Operation' diagnosis_name, 'IKL123' icd_code FROM dual UNION ALL
5 SELECT 'Right Knee Operation' diagnosis_name, 'IKR123' icd_code FROM dual UNION ALL
6 SELECT 'Fever' diagnosis_name, 'IF123' icd_code FROM dual
7 )
8 SELECT t.*,
9 utl_match.edit_distance_similarity(upper(diagnosis_name),upper('operation Knee right')) eds,
10 UTL_MATCH.jaro_winkler_similarity (upper(diagnosis_name),upper('operation Knee right')) jws
11 FROM DATA t
12 ORDER BY jws DESC
13 /
DIAGNOSIS_NAME ICD_CO EDS JWS
-------------------- ------ ---------- ----------
Right Knee Operation IKR123 20 72
Knee Operation IK123 20 70
Heart Operation IH123 25 68
Left Knee Operation IKL123 25 64
Fever IF123 15 47
SQL>
因此,您将看到两者之间的差异。jaro_winkler_similarity在识别数据输入错误并提供最接近的匹配方面做得更好。基于此,只需按降序对第一行进行排序即可:
SQL> WITH DATA AS(
2 SELECT 'Heart Operation' diagnosis_name, 'IH123' icd_code FROM dual UNION ALL
3 SELECT 'Knee Operation' diagnosis_name, 'IK123' icd_code FROM dual UNION ALL
4 SELECT 'Left Knee Operation' diagnosis_name, 'IKL123' icd_code FROM dual UNION ALL
5 SELECT 'Right Knee Operation' diagnosis_name, 'IKR123' icd_code FROM dual UNION ALL
6 SELECT 'Fever' diagnosis_name, 'IF123' icd_code FROM dual
7 )
8 SELECT diagnosis_name
9 FROM
10 (SELECT t.*,
11 utl_match.edit_distance_similarity(upper(diagnosis_name),upper('operation Knee right')) eds,
12 UTL_MATCH.jaro_winkler_similarity (upper(diagnosis_name),upper('operation Knee right')) jws
13 FROM DATA t
14 ORDER BY jws DESC
15 )
16 WHERE rownum = 1
17 /
DIAGNOSIS_NAME
--------------------
Right Knee Operation
SQL>
本文收集自互联网,转载请注明来源。
如有侵权,请联系[email protected] 删除。
我来说两句