In [1]:
import os
os.chdir('..')


In [2]:
from IPython.display import clear_output
# The star import provides config_data, system_prompts, pd, and the helper functions used below.
from kg_rag.utility import *

clear_output()


In [3]:

def generate_response(question, llm, kg_rag_flag, evidence_flag=False, temperature=0):
    """Stream an LLM answer to `question`, optionally enriched with knowledge-graph context."""
    # The same identifier is used for both the chat model and its deployment.
    CHAT_MODEL_ID = llm
    CHAT_DEPLOYMENT_ID = llm

    if kg_rag_flag:
        SYSTEM_PROMPT = system_prompts["KG_RAG_BASED_TEXT_GENERATION"]
        # Retrieval settings come from the project config.
        CONTEXT_VOLUME = int(config_data["CONTEXT_VOLUME"])
        QUESTION_VS_CONTEXT_SIMILARITY_PERCENTILE_THRESHOLD = float(config_data["QUESTION_VS_CONTEXT_SIMILARITY_PERCENTILE_THRESHOLD"])
        QUESTION_VS_CONTEXT_MINIMUM_SIMILARITY = float(config_data["QUESTION_VS_CONTEXT_MINIMUM_SIMILARITY"])
        VECTOR_DB_PATH = config_data["VECTOR_DB_PATH"]
        NODE_CONTEXT_PATH = config_data["NODE_CONTEXT_PATH"]
        SENTENCE_EMBEDDING_MODEL_FOR_NODE_RETRIEVAL = config_data["SENTENCE_EMBEDDING_MODEL_FOR_NODE_RETRIEVAL"]
        SENTENCE_EMBEDDING_MODEL_FOR_CONTEXT_RETRIEVAL = config_data["SENTENCE_EMBEDDING_MODEL_FOR_CONTEXT_RETRIEVAL"]
        # Note: the vector store, embedding model, and node context table are reloaded on every call.
        vectorstore = load_chroma(VECTOR_DB_PATH, SENTENCE_EMBEDDING_MODEL_FOR_NODE_RETRIEVAL)
        embedding_function_for_context_retrieval = load_sentence_transformer(SENTENCE_EMBEDDING_MODEL_FOR_CONTEXT_RETRIEVAL)
        node_context_df = pd.read_csv(NODE_CONTEXT_PATH)
        context = retrieve_context(question, vectorstore, embedding_function_for_context_retrieval, node_context_df, CONTEXT_VOLUME, QUESTION_VS_CONTEXT_SIMILARITY_PERCENTILE_THRESHOLD, QUESTION_VS_CONTEXT_MINIMUM_SIMILARITY, evidence_flag)
        # Prepend the retrieved context so the LLM answers from it.
        enriched_prompt = "Context: " + context + "\n" + "Question: " + question
        question = enriched_prompt
    else:
        # Plain prompting: the question is sent to the LLM unchanged.
        SYSTEM_PROMPT = system_prompts["PROMPT_BASED_TEXT_GENERATION"]

    output = get_GPT_response(question, SYSTEM_PROMPT, CHAT_MODEL_ID, CHAT_DEPLOYMENT_ID, temperature=temperature)
    stream_out(output)
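
The context-enrichment step inside `generate_response` can be looked at in isolation. The sketch below reproduces only the string assembly from that function; `build_enriched_prompt` is a hypothetical helper name that is not part of `kg_rag.utility`, and the context string is a placeholder rather than real retrieved text.

```python
def build_enriched_prompt(context, question):
    """Prefix the question with retrieved context, mirroring generate_response."""
    return "Context: " + context + "\n" + "Question: " + question


# Placeholder context for illustration only; in the notebook it comes from retrieve_context.
example_prompt = build_enriched_prompt(
    "<retrieved context>",
    "Is Parkinson's disease associated with PINK1 gene?",
)
print(example_prompt)
```

When `kg_rag_flag` is false, this step is skipped and the bare question is sent with a different system prompt.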


In [4]:

LLM_TO_USE = 'gpt-4'
TEMPERATURE = config_data["LLM_TEMPERATURE"]


## Question 1:

In [5]:
question = 'Are there any latest drugs used for weight management in patients with Bardet-Biedl Syndrome?'


### With KG-RAG

In [6]:
KG_RAG_FLAG = True
EDGE_EVIDENCE_FLAG = True  # Used only when KG_RAG_FLAG=True

generate_response(question, LLM_TO_USE, KG_RAG_FLAG, evidence_flag=EDGE_EVIDENCE_FLAG, temperature=TEMPERATURE)


Yes, the compound Setmelanotide is used to treat Bardet-Biedl syndrome. It is currently in phase 3 of clinical trials according to the sources ChEMBL and DrugCentral. However, it is advised to seek guidance from a healthcare professional for the most current and personalized treatment options. [Provenance: ChEMBL, DrugCentral]



### Without KG-RAG

In [7]:
KG_RAG_FLAG = False

generate_response(question, LLM_TO_USE, KG_RAG_FLAG, temperature=TEMPERATURE)


As of my knowledge up to date, there are no specific drugs designed for weight management in patients with Bardet-Biedl Syndrome. The treatment generally involves managing the symptoms and complications. However, any new developments would be best advised by a healthcare professional.



## Question 2:

In [8]:
question = 'Is it PNPLA3 or HLA-B that has a significant association with the disease liver benign neoplasm?'


### With KG-RAG

In [9]:
KG_RAG_FLAG = True
EDGE_EVIDENCE_FLAG = True  # Used only when KG_RAG_FLAG=True

generate_response(question, LLM_TO_USE, KG_RAG_FLAG, evidence_flag=EDGE_EVIDENCE_FLAG, temperature=TEMPERATURE)


The gene PNPLA3 has a more significant association with the disease liver benign neoplasm, as indicated by the lower p-value of 4e-14 compared to HLA-B's p-value of 2e-08. The provenance of this information is the GWAS Catalog.
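
The KG-RAG answer above ranks the two genes by p-value. As a rough check of that reasoning, the sketch below sorts the two reported p-values and compares each against 5e-8, the conventional genome-wide significance threshold. The threshold is an assumption brought in for illustration and is not stated in the notebook; the p-values are copied from the output above.

```python
# Conventional genome-wide significance cutoff (assumed here, not taken from the notebook).
GENOME_WIDE_SIGNIFICANCE = 5e-8

# p-values as reported in the KG-RAG answer above.
p_values = {"PNPLA3": 4e-14, "HLA-B": 2e-08}

# Smaller p-value means stronger statistical evidence, so sort ascending.
ranked = sorted(p_values, key=p_values.get)
passes_threshold = {gene: p < GENOME_WIDE_SIGNIFICANCE for gene, p in p_values.items()}

for gene in ranked:
    print(gene, p_values[gene], "significant" if passes_threshold[gene] else "not significant")
```

Under this assumed threshold both genes pass, which is consistent with the answer describing PNPLA3 as the more significant association rather than the only one.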



### Without KG-RAG

In [10]:
KG_RAG_FLAG = False

generate_response(question, LLM_TO_USE, KG_RAG_FLAG, temperature=TEMPERATURE)


It is PNPLA3 that has a significant association with the disease liver benign neoplasm.



## Question 3:

In [11]:
question = "Is Parkinson's disease associated with PINK1 gene?"


### With KG-RAG


In [12]:
KG_RAG_FLAG = True
EDGE_EVIDENCE_FLAG = True 

generate_response(question, LLM_TO_USE, KG_RAG_FLAG, evidence_flag=EDGE_EVIDENCE_FLAG, temperature=TEMPERATURE)


Yes, Parkinson's disease is associated with the PINK1 gene. This association is reported in the DISEASES database - https://diseases.jensenlab.org.



## Question 3: perturbed (entities in lowercase)

In [13]:
question = "Is parkinson's disease associated with pink1 gene?"


### With KG-RAG


In [14]:
KG_RAG_FLAG = True
EDGE_EVIDENCE_FLAG = True 

generate_response(question, LLM_TO_USE, KG_RAG_FLAG, evidence_flag=EDGE_EVIDENCE_FLAG, temperature=TEMPERATURE)


Yes, Parkinson's disease is associated with the PINK1 gene. This association is reported in the DISEASES database - https://diseases.jensenlab.org.
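
The perturbation used in this question (entity names written in lowercase) can be generated programmatically when testing more questions. The following is a minimal sketch; `lowercase_entities` is a hypothetical helper that is not part of `kg_rag.utility`, and the entity list is supplied by hand.

```python
import re


def lowercase_entities(question, entities):
    """Return `question` with every mention of the listed entities lowercased."""
    perturbed = question
    for entity in entities:
        # Match the entity case-insensitively and lowercase whatever was matched.
        pattern = re.compile(re.escape(entity), flags=re.IGNORECASE)
        perturbed = pattern.sub(lambda match: match.group(0).lower(), perturbed)
    return perturbed


original = "Is Parkinson's disease associated with PINK1 gene?"
perturbed = lowercase_entities(original, ["Parkinson's disease", "PINK1"])
print(perturbed)
```

Passing `perturbed` to `generate_response` in place of the original question reproduces the robustness check shown in this section.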



## Question 4:

In [15]:
question = "What are some protein markers associated with thoracic aortic aneurysm?"


### With KG-RAG

In [16]:
KG_RAG_FLAG = True
EDGE_EVIDENCE_FLAG = True 

generate_response(question, LLM_TO_USE, KG_RAG_FLAG, evidence_flag=EDGE_EVIDENCE_FLAG, temperature=TEMPERATURE)


The protein markers associated with thoracic aortic aneurysm include Chondroitin sulfate proteoglycan 4 (CSPG4), Matrix Gla protein (MGP), Interleukin-2 receptor subunit alpha (IL2RA), Interleukin-1 beta (IL1B), Myosin-10 (MYH10), Tropomyosin alpha-4 chain (TPM4), Tyrosine-protein kinase Mer (MERTK), and Stabilin-1 (STAB1). The provenance of these associations is the Cell Taxonomy database.



### Without KG-RAG

In [17]:
KG_RAG_FLAG = False

generate_response(question, LLM_TO_USE, KG_RAG_FLAG, temperature=TEMPERATURE)


Some protein markers associated with thoracic aortic aneurysm include Matrix metalloproteinases (MMPs), C-reactive protein (CRP), and fibrillin-1.



## Question 5:

In [18]:
question = "Are there any protein markers that show increased activity in adenocarcinoma?"


### With KG-RAG

In [19]:
KG_RAG_FLAG = True
EDGE_EVIDENCE_FLAG = True 

generate_response(question, LLM_TO_USE, KG_RAG_FLAG, evidence_flag=EDGE_EVIDENCE_FLAG, temperature=TEMPERATURE)


Yes, there are several protein markers that show increased activity in adenocarcinoma. These include Keratin, type II cytoskeletal 7 (Cytokeratin-7) (CK-7) (Keratin-7) (K7) (Sarcolectin) (Type-II keratin Kb7), Anterior gradient protein 2 homolog (AG-2) (hAG-2) (HPC8) (Secreted cement gland protein XAG-2 homolog), Guanine deaminase (Guanase) (Guanine aminase) (3.5.4.3) (Guanine aminohydrolase) (GAH) (p51-nedasin), and Graves disease carrier protein (GDC) (Graves disease autoantigen) (GDA) (Mitochondrial solute carrier protein homolog) (Solute carrier family 25 member 16). The provenance of these associations is the Cell Taxonomy.



### Without KG-RAG

In [20]:
KG_RAG_FLAG = False

generate_response(question, LLM_TO_USE, KG_RAG_FLAG, temperature=TEMPERATURE)


Yes, certain protein markers like carcinoembryonic antigen (CEA), CA 19-9, and cytokeratins can show increased activity in adenocarcinoma.



## Question 6:

In [21]:
question = "Do you know if ruxolitinib is approved as a pharmacologic treatment for vitiligo?"


### With KG-RAG

In [22]:
KG_RAG_FLAG = True
EDGE_EVIDENCE_FLAG = True 

generate_response(question, LLM_TO_USE, KG_RAG_FLAG, evidence_flag=EDGE_EVIDENCE_FLAG, temperature=TEMPERATURE)


Yes, Ruxolitinib is associated with the treatment of vitiligo. This association is supported by data from ChEMBL and DrugCentral databases. However, it is always recommended to seek guidance from a healthcare professional for treatment options. (Provenance: ChEMBL, DrugCentral)



### Without KG-RAG

In [23]:
KG_RAG_FLAG = False

generate_response(question, LLM_TO_USE, KG_RAG_FLAG, temperature=TEMPERATURE)


As of my knowledge up to date, ruxolitinib is not officially approved for the treatment of vitiligo. However, some clinical trials have shown promising results. Always consult with a healthcare provider for treatment options.



## Question 7:

In [24]:
question = "Are there any biomarkers that show increased profile in hydrocephalus?"


### With KG-RAG

In [25]:
KG_RAG_FLAG = True
EDGE_EVIDENCE_FLAG = True 

generate_response(question, LLM_TO_USE, KG_RAG_FLAG, evidence_flag=EDGE_EVIDENCE_FLAG, temperature=TEMPERATURE)


Yes, the Transmembrane protein 119 (TMEM119) and P2Y purinoceptor 12 (P2RY12) show an increased profile in hydrocephalus. The provenance of this information is the Cell Taxonomy database.



### Without KG-RAG

In [26]:
KG_RAG_FLAG = False

generate_response(question, LLM_TO_USE, KG_RAG_FLAG, temperature=TEMPERATURE)


Yes, certain biomarkers such as L1CAM, S100B, GFAP, and NSE have shown increased profiles in hydrocephalus.



## Question 8:

In [27]:

question = 'Does drug dependence have any genetic factors? Do you have any statistical evidence from trustworthy sources for this?'


### With KG-RAG

In [28]:
KG_RAG_FLAG = True
EDGE_EVIDENCE_FLAG = True 

generate_response(question, LLM_TO_USE, KG_RAG_FLAG, evidence_flag=EDGE_EVIDENCE_FLAG, temperature=TEMPERATURE)


Yes, drug dependence does have genetic factors. The genes KAT2B and SLC25A16 have been associated with drug dependence. This information is backed by statistical evidence from the GWAS Catalog, with p-values of 4e-10 and 1e-09 respectively.



### Without KG-RAG

In [29]:
KG_RAG_FLAG = False

generate_response(question, LLM_TO_USE, KG_RAG_FLAG, temperature=TEMPERATURE)


Yes, drug dependence does have genetic factors. According to the National Institute on Drug Abuse, genetics account for about 40-60% of a person's vulnerability to drug addiction.

