
python - Dynamically pass variables to a shell script in Databricks notebooks

This question is similar to How to pass Python variables to a shell script in an Azure Databricks notebook?, but it's slightly different.

I have a notebook that runs another notebook several times with different arguments. The issue is that one of the arguments needs to become an environment variable used by the shell (in this case I pass the name of the directory that I clone the git repo into). Everything works fine if I run the notebooks one by one. However, I was hoping to run them as threads, and in that case the environment variables get overwritten (obviously). I was hoping there is some smarter way to pass variables between Python and the shell that avoids overwriting them.
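To show concretely why they get overwritten: os.environ is a single process-wide mapping shared by every thread, so the last writer wins. A minimal repro (placeholder values, nothing Databricks-specific):

import os
from multiprocessing.pool import ThreadPool

# Both threads write the same process-wide key, so one value clobbers
# the other; the read-back may already show the other thread's value.
def set_and_read(value):
    os.environ["schema_name"] = value
    return os.environ["schema_name"]

print(ThreadPool(2).map(set_and_read, ["prod", "dev"]))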

Caller notebook:

from multiprocessing.pool import ThreadPool

pool = ThreadPool(10)
# starmap unpacks each inner list into the lambda's three parameters,
# launching two concurrent runs of the "testing_workflow" notebook.
pool.starmap(
  lambda schema_name, model_name, branch_name: dbutils.notebook.run(
    "testing_workflow",
    timeout_seconds=360,
    arguments={"schema_name": schema_name,
               "model_name": model_name,
               "branch_name": branch_name}),
  (["prod", "sp", "develop"], ["dev", "sp", "DE-1006"])
)

Callee notebook:

import os

# Build the full clone command in an environment variable; the secret scope
# holds the GitHub token and the widgets carry the per-run arguments.
os.environ["git_checkout"] = f"""git clone https://{dbutils.secrets.get(scope="dev", key="github_key")}@github.com/xxx/image.git --branch {dbutils.widgets.get("branch_name")} {dbutils.widgets.get("schema_name")}"""
os.environ["schema_name"] = dbutils.widgets.get("schema_name")

%sh ${git_checkout}
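One idea I've been sketching (untested): skip os.environ and the %sh cell entirely, and run the clone through subprocess.run with a per-call env mapping so each thread carries its own copy of the variables. The clone_repo helper below is a hypothetical name of mine; the URL and secret lookup mirror the snippet above, and dbutils is assumed to be available on the driver.

import os
import subprocess
from multiprocessing.pool import ThreadPool

# Hypothetical helper: each call builds its own copy of the environment,
# so concurrent threads never overwrite each other's variables the way
# writes to the shared os.environ do.
def clone_repo(schema_name, branch_name):
    token = dbutils.secrets.get(scope="dev", key="github_key")
    env = {**os.environ,
           "GIT_URL": f"https://{token}@github.com/xxx/image.git",
           "BRANCH_NAME": branch_name,
           "SCHEMA_NAME": schema_name}
    # The shell expands the placeholders from the per-call env mapping.
    subprocess.run("git clone $GIT_URL --branch $BRANCH_NAME $SCHEMA_NAME",
                   shell=True, env=env, check=True)

ThreadPool(2).starmap(clone_repo, [("prod", "develop"), ("dev", "DE-1006")])

Because env is a fresh dict per call, two clones into different target directories can run in parallel without the overwriting I see with os.environ and %sh.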


1 Answer

Awaiting an expert's reply.
