
I'm trying to understand the Enterprise Guide process flow. As I understand it, the process flow is supposed to make it easy to run related steps in the order they need to be run to make a dependent action able to run and be up to date somewhere later in the flow.

Given that understanding, I'm getting stuck trying to make the process flow work in cases where the temporary data is purged. I'm warned when closing Enterprise Guide that the project has references to temporary data, which must be the tables I created. That should be fine: the source data is on the SAS server, and I wrote code to import that data into SAS.

I would expect the data to be regenerated when I later try to run an analysis that depends on it, but instead I get an error indicating that the input data does not exist. If I then run the code to import the data and/or join tables in each necessary place, the process flow seems to work as expected.

See the flow that I'm working with below:

[image: process flow]

I'm sure I must be missing something. Imagine I want to rerun the rightmost linear regression. Is there a way to make the process flow import the data automatically, without my having to manually rerun each individual table-creation step first?

So you're (later) running Linear Regression (2) and want to know why the Code For Import Data steps aren't automatically run? - Joe
If you want the temporary tables available later, you need to save them to a project library. I would create a project library and then move the temp data sets there, so that rather than WORK they're assigned to myLIB. This means you may store more data than you want, but if you don't have the data saved, then yes, you need to re-run your full process to recreate the data. - Reeza
@Joe Yes, that's what I'm wondering. - Joey Harwood
'update' updates by rerunning that one step, not by rerunning the whole PF up to that point (unfortunately). - Joe
@JoeyHarwood It runs the process moving 'forward' in the diagram, but doesn't check whether the previous steps have run. An option for that would be useful, and I thought it already had one, to be honest. Worth putting this suggestion on the EG ballot list if it's not there already. - Reeza

1 Answer


The general answer to your question is probably that you can't really do what you're wanting directly, but you can do it indirectly.

A process flow (of which you can have many per project, don't forget) is a single set of programs/tasks/etc. that you intend to run as a group. Typically, you will run whole process flows at once, rather than just individual pieces. If there is a point where you want to pause, look at things, and then continue, you have a few choices.

One is to have a process flow that goes to that point, then a second process flow that starts from that point. You can even take your 'import data' steps out of the process flow entirely, make an 'import data' process flow, always run that first, then run the other process flows individually as you need them. In fact, if you use the AUTOEXEC process flow, you could have the import data steps run whenever you open the project, and imported data ready and waiting for you.

A second is to use the UI: control+click, or drag a box on the process flow, to select a group of programs to run. Select the first five, say, and run them, then use the 'run branch from program...' option to run from that point on. You could also make separate 'branches' and run just one branch at a time, making each branch dependent on the input streams.

A third option would be to have different starting points for different analysis tasks, and have the import data bit come after that starting point. It could be common to all the starting points, and use macro variables and conditional execution to go in different directions. For example, you could have a macro variable set in the first program that says which analysis program you're running, then have the conditional on the last import step (with the imports in sequence, not in parallel like you have them) send you off to whatever analysis task the macro variable names. You could also have macro variables that indicate whether an import has already been run in the current session, which conditional steps could then use to skip rerunning it.
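As a rough illustration of that last idea, here is a minimal sketch of a guard written in code rather than through the UI's conditional steps. It is only one possible approach. The macro name `import_once`, the flag `import_done`, the table `work.sales`, and the file path are all hypothetical placeholders, not names from the project in the question.

```sas
/* Hypothetical session flag. %global leaves an existing value untouched, */
/* so the flag survives between runs within the same SAS session.        */
%global import_done;

%macro import_once(ds=work.sales, file=/path/to/sales.csv);
    /* Re-import if the flag is unset OR the table has been purged. */
    %if "&import_done" ne "1" or not %sysfunc(exist(&ds)) %then %do;
        proc import datafile="&file" out=&ds dbms=csv replace;
            getnames=yes;
        run;
        %let import_done = 1;
    %end;
    %else %do;
        %put NOTE: &ds already imported in this session, skipping.;
    %end;
%mend import_once;

/* Call this at the top of any analysis program that needs the table. */
%import_once(ds=work.sales, file=/path/to/sales.csv);
```

Checking `exist()` as well as the flag covers the case from the question where the session's temporary tables have been purged; the flag alone would not detect that.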

Unfortunately, though, there's no direct way to pick a step and say 'run this and all of its dependencies'.