Tenant Management with SemPy: Settings, Audit Events, and Governance Patterns
Walkthrough of the new sempy.fabric.admin module: retrieving the configuration settings for tenants, auditing the activities performed on those settings, capturing everything into the lakehouse, and creating governance use cases such as config drift and activity dashboards.
I hosted a session in the Fabric Discord together with Teemu Multanen in which we looked at how one can automate Fabric tenant management without constantly going through the Admin Portal each time you want to see what setting is enabled or disabled. This post is the written version of that session.
Anyone who has worked with SemPy probably knows this already, but it is worth stating: the admin functionality is fairly new, and the existing SemPy documentation mostly covers its semantic model features rather than tenant governance.
What Are Tenant Settings?
Tenant settings are the governance knobs that control what your entire organization can do in Fabric. They live in the Admin Portal, under Tenant Settings, and they cover things like:
| Setting area | What it controls |
|---|---|
| Export | Whether users can export data to Excel or CSV |
| Publishing | Whether users can publish reports to the web |
| API access | Which security groups can call the REST APIs |
| Workspace creation | Whether users can create new workspaces |
| Copilot and AI | Access to Copilot-powered and Azure OpenAI features |
| Cross-tenant sharing | Whether data can be shared with external tenants |
You can read more about how these settings behave (the disabled/enabled/enabled-except-for-groups pattern) in About tenant settings, and the full catalog of what exists today is in the tenant settings index.
The problem with managing these settings entirely from the portal is that there is no native change history. Disabling external sharing one Tuesday, forgetting to log it, and then spending time three weeks later working out why a partner API broke is not good practice. The fix is to pull the settings out with code that tracks the configuration as it changes and alerts you when something is off.
SemPy's Admin Module
sempy.fabric.admin is a fairly new addition to the SemPy package. It landed in semantic-link 0.14.0, and it's a big one: roughly 75 admin functions covering workspaces, capacities, domains, tenant settings, items, reports, datasets, tags, and activity events, all wrapping the official Fabric and Power BI admin REST APIs.
This module is new enough that your default Fabric runtime might be running an older SemPy version. Run %pip install -U semantic-link at the top of your notebook before importing sempy.fabric.admin, or you'll get an ImportError and wonder why a documented function doesn't exist.
Note: you need to be a Fabric Administrator to use the admin module.
What I like about it compared to the older pattern you'll see in a lot of community blog posts (instantiate fabric.PowerBIRestClient(), call .get("v1/admin/tenantsettings"), parse the JSON yourself) is that auth is handled for you on Fabric compute. No token wrangling, no manually wiring up a FabricRestClient, and the functions return clean DataFrames instead of nested JSON you have to flatten.
Pulling the current tenant settings looks like this:
import pandas as pd
import sempy.fabric.admin as admin

# Get tenant settings
settings = admin.list_tenant_settings()

# Display all columns cleanly
pd.set_option("display.max_columns", None)
pd.set_option("display.max_colwidth", None)
pd.set_option("display.width", 200)

# Sort for readability; the returned column names are humanized ("Setting Name")
if "Setting Name" in settings.columns:
    settings = settings.sort_values("Setting Name")

display(settings)
That's the whole thing. list_tenant_settings() returns a DataFrame with one row per setting:
| Column | What it holds |
|---|---|
| Setting Name | The setting's identifier, e.g. AllowExternalSharing |
| Title | The human-readable label shown in the Admin Portal |
| Enabled | Whether the setting is currently on |
| Tenant Setting Group | The category it belongs to |
| Can Specify Security Groups | Whether the setting supports group-level scoping |
| Enabled Security Groups | Which groups are included, if scoped |
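As a quick illustration of working with that shape, the sketch below filters a hand-built DataFrame that mimics four of the columns listed above. The rows, the two "Example..." setting names, and the group entry are made up for illustration; in a notebook you would start from the result of admin.list_tenant_settings() instead.

```python
import pandas as pd

# Hypothetical sample mimicking the columns described above; real data
# would come from admin.list_tenant_settings().
sample = pd.DataFrame(
    {
        "Setting Name": ["AllowExternalSharing", "ExampleExportSetting", "ExamplePublishSetting"],
        "Enabled": [True, True, False],
        "Can Specify Security Groups": [True, True, False],
        "Enabled Security Groups": [[], [{"name": "Example Group"}], []],
    }
)

# True where at least one security group is listed for the setting.
has_groups = sample["Enabled Security Groups"].apply(
    lambda groups: isinstance(groups, list) and len(groups) > 0
)

# Settings that are on for the whole tenant even though they could be
# scoped to security groups: often the first thing worth reviewing.
unscoped = sample[
    sample["Enabled"] & sample["Can Specify Security Groups"] & ~has_groups
]

print(unscoped["Setting Name"].tolist())
```

The isinstance check is there because an unscoped setting may come back as an empty list or as a missing value; treating both as "no groups" is an assumption of this sketch.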
Updating Tenant Settings
Updating tenant settings is just as easy.
import sempy.fabric.admin as admin

tenant_setting_name = "ArtifactOrgAppPreview"

admin.update_tenant_setting(
    tenant_setting=tenant_setting_name,
    enabled=True,
    verbose=1
)

print(f"Updated tenant setting {tenant_setting_name} to enabled.")
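Because a write changes behaviour for the whole tenant, it can help to compute the intended changes first and review them before calling anything. The sketch below compares a desired state with the current one and returns only the settings whose value would actually change. The helper plan_setting_changes and both dictionaries are hypothetical; it makes no API calls, and each planned entry would still be applied with update_tenant_setting as shown above.

```python
def plan_setting_changes(current: dict, desired: dict) -> list:
    """Return (setting_name, current_value, desired_value) for each real change.

    Settings missing from `current` are reported with None as the current
    value instead of being silently skipped.
    """
    plan = []
    for name, want in sorted(desired.items()):
        have = current.get(name)
        if have != want:
            plan.append((name, have, want))
    return plan


# Hypothetical state; in a notebook `current` could be built from the
# "Setting Name" and "Enabled" columns of list_tenant_settings().
current = {"ArtifactOrgAppPreview": False, "AllowExternalSharing": True}
desired = {"ArtifactOrgAppPreview": True, "AllowExternalSharing": True}

for name, have, want in plan_setting_changes(current, desired):
    print(f"{name}: {have} -> {want}")
```

Keeping the planning step free of side effects means it can be run and reviewed as often as needed before anything is written to the tenant.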
Snapshotting Tenant Settings to a Lakehouse
A single call gives you a point-in-time view. To get any value out of this over time, you need to store it somewhere. I land it in a Delta table:
import json
import re
import hashlib
import pandas as pd
from datetime import datetime, timezone
import sempy.fabric.admin as admin

table_name = "tenant_settings_snapshot"
snapshot_utc = datetime.now(timezone.utc)

df = admin.list_tenant_settings()

def clean_col_name(col):
    col = col.strip()
    col = re.sub(r"[^A-Za-z0-9_]", "_", col)
    col = re.sub(r"_+", "_", col)
    return col.strip("_").lower()

df = df.rename(columns={col: clean_col_name(col) for col in df.columns})

# Convert list/dict/object-ish columns to JSON/text
for col in df.columns:
    df[col] = df[col].apply(
        lambda x: json.dumps(x, sort_keys=True, default=str)
        if isinstance(x, (dict, list))
        else x
    )

# Force known problem columns to string, even when all values are null
string_columns = [
    "setting_name",
    "tenant_setting_group",
    "enabled_security_groups",
    "excluded_security_groups",
    "security_groups",
    "description",
    "title"
]

for col in string_columns:
    if col in df.columns:
        df[col] = df[col].astype("string").fillna("")

df["snapshot_utc"] = snapshot_utc.isoformat()
df["snapshot_date"] = snapshot_utc.date().isoformat()

hash_cols = [c for c in df.columns if c not in ["snapshot_utc", "snapshot_date"]]

def row_hash(row):
    payload = json.dumps({c: row[c] for c in hash_cols}, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

df["setting_hash"] = df.apply(row_hash, axis=1)

spark_df = spark.createDataFrame(df)

# Extra safety: cast any NullType columns to string
for field in spark_df.schema.fields:
    if field.dataType.simpleString() == "void":
        spark_df = spark_df.withColumn(field.name, spark_df[field.name].cast("string"))

spark_df.write \
    .format("delta") \
    .mode("append") \
    .option("mergeSchema", "true") \
    .saveAsTable(table_name)

print(f"Saved {spark_df.count()} tenant settings to {table_name}")
Cleaning the columns matters more than it looks. Enabled Security Groups comes back as a list of dicts inside the DataFrame, and Spark cannot infer a schema for that. This is why I serialize anything that is a dict or a list to JSON before writing. The mergeSchema option is there because Microsoft occasionally adds new tenant setting properties, and the table schema needs to be able to grow without breaking the append.
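The sort_keys=True argument in the serialization and hashing steps is what keeps setting_hash stable. Here is a minimal, standalone sketch of that idea using only the standard library; the helper name stable_hash and the group fields are illustrative, not part of SemPy or the API response.

```python
import hashlib
import json


def stable_hash(value) -> str:
    """Hash a JSON-serializable value so that dict key order does not matter."""
    payload = json.dumps(value, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Two hypothetical group entries with the same content but different key order.
a = [{"name": "Example Group", "type": "SecurityGroup"}]
b = [{"type": "SecurityGroup", "name": "Example Group"}]

# Key order inside a dict is normalized away...
print(stable_hash(a) == stable_hash(b))

# ...but the order of items in a list still counts as a difference.
print(stable_hash([1, 2]) == stable_hash([2, 1]))
```

If the API ever returned the same groups in a different order, the hash would change and the setting would be flagged as drift. Sorting the list before hashing would avoid that, at the cost of assuming a sort key.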
Pattern: Config Drift Detection
This is why you want to take snapshots each day as opposed to just running the function whenever you remember.
Schedule the snapshot: a Data Pipeline runs the notebook above each day and appends the full settings snapshot to the Delta table, with no manual action required.
Diff against the previous run: a full outer join of the latest snapshot and the previous one, keyed on setting_name. A row whose setting_hash differs between the two snapshots represents drift, and a row present on only one side is a setting that was added or removed. It is nothing more than a query against the snapshot table.
Alert and document: write the detected changes to a second Delta table called drift_log and build a Power BI report on it that shows every change made to each setting over time. The next time somebody asks whether a sharing permission was changed last month, you can give them an answer.
from pyspark.sql import functions as F
from pyspark.sql.window import Window

table_name = "tenant_settings_snapshot"

df = spark.table(table_name)

# Get latest and previous snapshot timestamps
snapshot_rows = (
    df.select("snapshot_utc")
    .distinct()
    .orderBy(F.col("snapshot_utc").desc())
    .limit(2)
    .collect()
)

if len(snapshot_rows) < 2:
    print("Not enough snapshots to compare. Capture at least two snapshots first.")
else:
    latest_snapshot = snapshot_rows[0]["snapshot_utc"]
    previous_snapshot = snapshot_rows[1]["snapshot_utc"]

    latest_df = (
        df.filter(F.col("snapshot_utc") == latest_snapshot)
        .alias("l")
    )
    previous_df = (
        df.filter(F.col("snapshot_utc") == previous_snapshot)
        .alias("p")
    )

    changes_df = (
        latest_df
        .join(previous_df, on="setting_name", how="full_outer")
        .select(
            F.coalesce(F.col("l.setting_name"), F.col("p.setting_name")).alias("setting_name"),
            F.col("p.snapshot_utc").alias("previous_snapshot_utc"),
            F.col("l.snapshot_utc").alias("latest_snapshot_utc"),
            F.col("p.enabled").alias("previous_enabled"),
            F.col("l.enabled").alias("latest_enabled"),
            F.col("p.enabled_security_groups").alias("previous_enabled_security_groups"),
            F.col("l.enabled_security_groups").alias("latest_enabled_security_groups"),
            F.col("p.setting_hash").alias("previous_setting_hash"),
            F.col("l.setting_hash").alias("latest_setting_hash"),
            F.when(F.col("p.setting_name").isNull(), F.lit("Added"))
            .when(F.col("l.setting_name").isNull(), F.lit("Removed"))
            .when(F.col("p.setting_hash") != F.col("l.setting_hash"), F.lit("Changed"))
            .otherwise(F.lit("Unchanged"))
            .alias("change_type")
        )
        .filter(F.col("change_type") != "Unchanged")
        .orderBy("change_type", "setting_name")
    )

    change_count = changes_df.count()

    if change_count == 0:
        print(
            f"No tenant setting changes detected.\n"
            f"Previous snapshot: {previous_snapshot}\n"
            f"Latest snapshot : {latest_snapshot}"
        )
    else:
        print(f"{change_count} tenant setting change(s) detected.")
        print(f"Previous snapshot: {previous_snapshot}")
        print(f"Latest snapshot : {latest_snapshot}")
        print()

        for row in changes_df.collect():
            print("=" * 100)
            print(f"Setting : {row['setting_name']}")
            print(f"Change : {row['change_type']}")

            if row["change_type"] == "Added":
                print(f"Enabled : {row['latest_enabled']}")
                print(f"Groups : {row['latest_enabled_security_groups']}")
            elif row["change_type"] == "Removed":
                print(f"Previous Enabled : {row['previous_enabled']}")
                print(f"Previous Groups : {row['previous_enabled_security_groups']}")
            elif row["change_type"] == "Changed":
                print()
                print("Enabled:")
                print(f" Previous: {row['previous_enabled']}")
                print(f" Latest : {row['latest_enabled']}")
                print()
                print("Security Groups:")
                print(f" Previous: {row['previous_enabled_security_groups']}")
                print(f" Latest : {row['latest_enabled_security_groups']}")
The portal does show a banner when Microsoft adds or changes a setting, but it says nothing about changes your own admins made. The drift pattern above catches both, which is the whole point.
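The Added / Removed / Changed rules in the Spark job are easier to unit test when written as plain Python. The sketch below restates them over two dictionaries that map setting_name to setting_hash; the function name classify_changes and the sample hashes are hypothetical, and this stands in for the join logic rather than replacing it.

```python
def classify_changes(previous: dict, latest: dict) -> dict:
    """Map setting name -> 'Added' / 'Removed' / 'Changed' between two snapshots.

    Both arguments map setting_name -> setting_hash. Unchanged settings are
    left out, like the filter on change_type in the Spark version.
    """
    changes = {}
    for name in sorted(set(previous) | set(latest)):
        if name not in previous:
            changes[name] = "Added"
        elif name not in latest:
            changes[name] = "Removed"
        elif previous[name] != latest[name]:
            changes[name] = "Changed"
    return changes


# Hypothetical hashes standing in for the setting_hash column.
previous = {"SettingA": "h1", "SettingB": "h2", "SettingC": "h3"}
latest = {"SettingA": "h1", "SettingB": "h2-modified", "SettingD": "h4"}

print(classify_changes(previous, latest))
```

Taking the union of both key sets is the dictionary equivalent of the full outer join: a setting that exists in only one snapshot still produces a row.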
Audit Events via the Power BI REST API
Tenant settings show you what can be done. Audit operations show you what is being done, which is the other half of governance information that can’t be provided by settings alone.
The GetActivityEvents Endpoint
Although this is a Power BI REST API rather than a Fabric one, it is the source for tenant-wide activity data, and sempy.fabric.admin wraps it. A non-exhaustive list of operation types:
| Operation | What it tells you |
|---|---|
| ViewReport | Who viewed which report, and when |
| ExportReport | PDF, PowerPoint, and image exports |
| ExportData | Underlying dataset data exports |
| DeleteDataset | Dataset deletions |
| ShareDashboard | Dashboard sharing events |
| CreateApp | App creation |
| AnalyzeInExcel | Open-in-Excel and other data extracts |
Once you have collected a few weeks of data, questions such as which reports have never been accessed, who exports the most data, and whether anyone accessed a dataset before it was deleted become answerable.
Key Constraints to Know Upfront
Before beginning to write the script, it is best to plan both the loop and the required permissions.
| Constraint | Detail |
|---|---|
| Same UTC day window | start_time and end_time must fall within the same UTC day. A single call covers up to 24 hours. |
| Lookback limit | Events older than 28 days aren't available through this endpoint. |
| Fabric Administrator required | The caller must be a Fabric administrator, or authenticate with a service principal. |
| Pagination | Large result sets return a continuation token; list_activity_events() handles paging internally, so you get back a single complete DataFrame for the window you asked for. |
| Rate limit | 200 requests per hour for activity events (compare to 25 requests per minute for tenant settings reads and writes). |
You'll see "30 days" quoted constantly in community blog posts and even in some conference decks (mine included, until I checked). The current Admin - Get Activity Events reference on Microsoft Learn says 28 days, not 30. If you're building a retention strategy around this, build in a buffer rather than cutting it close on day 29.
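One way to build in that buffer is to compute the list of days a job is allowed to request and trim the oldest ones. The sketch below is an assumption-laden illustration: queryable_days and the two-day default buffer are made up, and whether "today" counts as one of the 28 days is a choice made here, not something the endpoint documentation is quoted on.

```python
from datetime import date, timedelta


def queryable_days(today: date, lookback_days: int = 28, buffer_days: int = 2) -> list:
    """List the UTC days that are safe to request, oldest first.

    `buffer_days` trims the oldest days so a late-running job does not ask
    for a day that has just aged out of the lookback window.
    """
    usable = lookback_days - buffer_days
    if usable <= 0:
        raise ValueError("buffer_days must be smaller than lookback_days")
    return [today - timedelta(days=offset) for offset in range(usable - 1, -1, -1)]


days = queryable_days(date(2025, 1, 31))
print(days[0], days[-1], len(days))
```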
Pulling Events with list_activity_events()
from datetime import datetime, timezone, time
import sempy.fabric.admin as admin
today_utc = datetime.now(timezone.utc).date()
start_time = datetime.combine(today_utc, time.min).strftime("%Y-%m-%dT%H:%M:%S")
end_time = datetime.combine(today_utc, time.max).strftime("%Y-%m-%dT%H:%M:%S")
events = admin.list_activity_events(
    start_time=start_time,
    end_time=end_time,
    activity="ViewReport",  # optional
)
display(events)
Pass start_time and end_time, which must fall on the same UTC day or the function raises a ValueError. Optionally pass activity or user_id to filter at the source instead of pulling every record and filtering afterwards in pandas.
One point of confusion when porting scripts that called the raw REST API directly: the DataFrame columns here are humanized rather than run-together names such as UserId. That means df["User Id"] and df["Workspace Name"], not df["UserId"] and df["WorkspaceName"].
Pattern: Lightweight Activity Dashboard
The loop isn't all too difficult if you think of the same-UTC-day limit as our basic unit of work: loop through the days, write those days' worth of data into the lakehouse, create a Direct Lake semantic model layer above that, and report from the latter.
A Data Pipeline runs a notebook daily that makes its single call for the day (one call per day, as the API's constraints dictate), writes to a Delta table partitioned by date, and is done. It's not really a loop at all unless you're backfilling, in which case you need one call per day in the backfill range because of the same-day constraint.
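For the backfill case, the per-day loop can be sketched as follows. The fetch function is injected so the example runs without a tenant; day_window, backfill, and fake_fetch are hypothetical names, and in a notebook the fetcher would wrap admin.list_activity_events with the start_time and end_time parameters shown earlier.

```python
from datetime import date, datetime, time, timedelta


def day_window(day: date) -> tuple:
    """Return (start_time, end_time) strings covering one UTC day."""
    fmt = "%Y-%m-%dT%H:%M:%S"
    start = datetime.combine(day, time.min).strftime(fmt)
    end = datetime.combine(day, time.max).strftime(fmt)
    return start, end


def backfill(first_day: date, last_day: date, fetch) -> dict:
    """Call `fetch(start_time, end_time)` once per day; return results keyed by day."""
    results = {}
    day = first_day
    while day <= last_day:
        start_time, end_time = day_window(day)
        results[day.isoformat()] = fetch(start_time, end_time)
        day += timedelta(days=1)
    return results


# Stand-in fetcher so the sketch runs anywhere. In a notebook this could be:
# lambda s, e: admin.list_activity_events(start_time=s, end_time=e)
def fake_fetch(start_time, end_time):
    return f"{start_time}..{end_time}"


out = backfill(date(2025, 1, 30), date(2025, 2, 1), fake_fetch)
for day, result in out.items():
    print(day, result)
```

Keying the results by day makes it straightforward to write each day to its own partition and to re-run a single failed day without repeating the whole range.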
From there, create a Direct Lake semantic model on top of the lakehouse: no imports and no scheduled refreshes to babysit. A handful of views will do most of the reporting work: reports with few or no views, exports per user to identify heavy exporters, an hourly activity heat map to see which times of day put the most stress on your capacity, and a trend line over the past 30 days.
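Two of those views, exports per user and the hourly heat map, reduce to simple aggregations. The sketch below runs them in pandas on a tiny made-up events table. The "User Id" column follows the humanized naming described earlier, while "Activity" and "Creation Time" are assumed column names and the rows are invented.

```python
import pandas as pd

# Hypothetical events; real rows would come from the activity events table.
events = pd.DataFrame(
    {
        "User Id": ["ana@example.com", "ben@example.com", "ana@example.com", "ana@example.com"],
        "Activity": ["ExportData", "ViewReport", "ExportData", "ViewReport"],
        "Creation Time": [
            "2025-01-30T09:15:00Z",
            "2025-01-30T09:40:00Z",
            "2025-01-30T14:05:00Z",
            "2025-01-30T14:30:00Z",
        ],
    }
)

# Heavy exporters: export events per user, largest first.
exports_per_user = (
    events[events["Activity"] == "ExportData"]
    .groupby("User Id")
    .size()
    .sort_values(ascending=False)
)

# Hourly load: events per UTC hour, the raw material for a heat map.
hours = pd.to_datetime(events["Creation Time"], utc=True).dt.hour
events_per_hour = events.groupby(hours).size()

print(exports_per_user.to_dict())
print(events_per_hour.to_dict())
```

In the actual dashboard these would more likely be measures in the semantic model; the pandas version is only a way to check the logic against a small sample.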
After the data is safely stored away, the next step is to combine the tenant settings snapshot with the activity events data to paint a picture of who changed what and when.
import re
import pandas as pd
from datetime import datetime, timezone, time
from pyspark.sql import functions as F
import sempy.fabric.admin as admin

table_name = "tenant_settings_snapshot"

# ------------------------------------------------------------
# Helpers
# ------------------------------------------------------------
def get_col(row, *names):
    # Return the first non-empty value among the given column names
    for name in names:
        value = row.get(name, "")
        if value not in [None, ""]:
            return value
    return ""

def clean_col_name(col):
    col = col.strip()
    col = re.sub(r"[^A-Za-z0-9_]", "_", col)
    col = re.sub(r"_+", "_", col)
    return col.strip("_").lower()

# ------------------------------------------------------------
# 1. Detect changed tenant settings
# ------------------------------------------------------------
df = spark.table(table_name)

snapshot_rows = (
    df.select("snapshot_utc")
    .distinct()
    .orderBy(F.col("snapshot_utc").desc())
    .limit(2)
    .collect()
)

if len(snapshot_rows) < 2:
    print("Not enough snapshots to compare. Need at least two snapshots.")
else:
    latest_snapshot = snapshot_rows[0]["snapshot_utc"]
    previous_snapshot = snapshot_rows[1]["snapshot_utc"]

    latest_df = df.filter(F.col("snapshot_utc") == latest_snapshot).alias("l")
    previous_df = df.filter(F.col("snapshot_utc") == previous_snapshot).alias("p")

    changes_df = (
        latest_df
        .join(previous_df, on="setting_name", how="full_outer")
        .select(
            F.coalesce(F.col("l.setting_name"), F.col("p.setting_name")).alias("setting_name"),
            F.col("p.snapshot_utc").alias("previous_snapshot_utc"),
            F.col("l.snapshot_utc").alias("latest_snapshot_utc"),
            F.col("p.enabled").alias("previous_enabled"),
            F.col("l.enabled").alias("latest_enabled"),
            F.col("p.enabled_security_groups").alias("previous_enabled_security_groups"),
            F.col("l.enabled_security_groups").alias("latest_enabled_security_groups"),
            F.col("p.setting_hash").alias("previous_setting_hash"),
            F.col("l.setting_hash").alias("latest_setting_hash"),
            F.when(F.col("p.setting_name").isNull(), F.lit("Added"))
            .when(F.col("l.setting_name").isNull(), F.lit("Removed"))
            .when(F.col("p.setting_hash") != F.col("l.setting_hash"), F.lit("Changed"))
            .otherwise(F.lit("Unchanged"))
            .alias("change_type")
        )
        .filter(F.col("change_type") != "Unchanged")
        .orderBy("setting_name")
    )

    changes = changes_df.collect()

    if not changes:
        print("No tenant setting changes detected.")
        print(f"Previous snapshot: {previous_snapshot}")
        print(f"Latest snapshot : {latest_snapshot}")
    else:
        print(f"{len(changes)} tenant setting change(s) detected.")
        print(f"Previous snapshot: {previous_snapshot}")
        print(f"Latest snapshot : {latest_snapshot}")
        print()

        # ------------------------------------------------------------
        # 2. Get UpdatedAdminFeatureSwitch audit events
        # ------------------------------------------------------------
        # snapshot_utc is stored as an ISO 8601 string, so parse it back
        previous_dt = datetime.fromisoformat(previous_snapshot.replace("Z", "+00:00"))
        latest_dt = datetime.fromisoformat(latest_snapshot.replace("Z", "+00:00"))

        # Only the UTC day of the latest snapshot is queried (same-day API
        # constraint). Changes made on an earlier day in the snapshot window
        # will not have matching events here.
        audit_day = latest_dt.date()
        start_time = datetime.combine(audit_day, time.min).strftime("%Y-%m-%dT%H:%M:%S")
        end_time = datetime.combine(audit_day, time.max).strftime("%Y-%m-%dT%H:%M:%S")

        events = admin.list_activity_events(
            start_time=start_time,
            end_time=end_time
        )

        events_pdf = events if isinstance(events, pd.DataFrame) else pd.DataFrame(events)

        if events_pdf.empty:
            updated_events = pd.DataFrame()
        else:
            events_pdf.columns = [clean_col_name(c) for c in events_pdf.columns]

            # Keep rows where any column mentions UpdatedAdminFeatureSwitch
            updated_events = events_pdf[
                events_pdf.apply(
                    lambda row: row.astype(str).str.contains(
                        "UpdatedAdminFeatureSwitch",
                        case=False,
                        na=False
                    ).any(),
                    axis=1
                )
            ].copy()

        # Narrow the events to the interval between the two snapshots
        if "creation_time" in updated_events.columns:
            updated_events["creation_time_parsed"] = pd.to_datetime(
                updated_events["creation_time"],
                errors="coerce",
                utc=True
            )
            updated_events = updated_events[
                (updated_events["creation_time_parsed"] >= previous_dt) &
                (updated_events["creation_time_parsed"] <= latest_dt)
            ]

        # ------------------------------------------------------------
        # 3. Friendly output
        # ------------------------------------------------------------
        for change in changes:
            print("=" * 120)
            print(f"Tenant setting changed: {change['setting_name']}")
            print(f"Change type : {change['change_type']}")
            print()
            print("Enabled:")
            print(f" Previous: {change['previous_enabled']}")
            print(f" Latest : {change['latest_enabled']}")
            print()
            print("Security Groups:")
            print(f" Previous: {change['previous_enabled_security_groups']}")
            print(f" Latest : {change['latest_enabled_security_groups']}")
            print()

            if updated_events.empty:
                print("Changed by: No matching UpdatedAdminFeatureSwitch audit event found in the snapshot window.")
            else:
                # Events are not matched to a specific setting, so every
                # candidate event is listed under every changed setting
                print("Possible UpdatedAdminFeatureSwitch actor event(s):")
                for _, event in updated_events.iterrows():
                    print("-" * 80)
                    print(f"Time : {get_col(event, 'creation_time')}")
                    # The [:21] slice truncates the whole printed line, which
                    # shortens the user id shown in the output
                    print(f"User : {get_col(event, 'user_id', 'user_key')}"[:21])
                    print(f"Activity : {get_col(event, 'activity')}")
                    print(f"Operation : {get_col(event, 'operation')}")
                    print(f"Is Success: {get_col(event, 'is_success')}")
                    print(f"Raw object : {get_col(event, 'object_id', 'object_display_name', 'item_name')}")
            print()
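The audit lookup above requests only the UTC day of the latest snapshot, while two daily snapshots usually straddle midnight. One possible extension, sketched here under the assumption that both timestamps are timezone-aware, is to split the snapshot interval into one sub-window per UTC day and make one call for each. The helper split_by_utc_day is hypothetical and the timestamps are sample values.

```python
from datetime import datetime, time, timedelta, timezone


def split_by_utc_day(start: datetime, end: datetime) -> list:
    """Split [start, end] into (window_start, window_end) pairs, one per UTC day."""
    start = start.astimezone(timezone.utc)
    end = end.astimezone(timezone.utc)
    windows = []
    day = start.date()
    while day <= end.date():
        day_start = datetime.combine(day, time.min, tzinfo=timezone.utc)
        # Drop microseconds so the end of day formats as 23:59:59
        day_end = datetime.combine(day, time.max, tzinfo=timezone.utc).replace(microsecond=0)
        # Clip the day to the requested interval
        windows.append((max(start, day_start), min(end, day_end)))
        day += timedelta(days=1)
    return windows


previous_dt = datetime(2025, 1, 30, 6, 0, tzinfo=timezone.utc)
latest_dt = datetime(2025, 1, 31, 6, 0, tzinfo=timezone.utc)

fmt = "%Y-%m-%dT%H:%M:%S"
for window_start, window_end in split_by_utc_day(previous_dt, latest_dt):
    print(window_start.strftime(fmt), window_end.strftime(fmt))
```

Each pair can then be formatted and passed as start_time and end_time, with the per-day results concatenated before the UpdatedAdminFeatureSwitch filter is applied.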
Putting It Together: A Governance Story
State and activity answer two different questions, and you need both for a complete picture.
list_tenant_settings() shows what is currently allowed: which features are enabled, which groups they are scoped to, and, once you snapshot daily, when each of those last changed. list_activity_events() shows what people are actually doing with that access, over whatever period you have retained.
Combined, they answer the question every governance discussion comes back to: not just whether a setting is turned on, but who is using what it unlocks and whether that use matches expectations.
If your next step is to start flipping tenant settings programmatically with update_tenant_setting, know that the underlying Tenants - Update Tenant Setting REST API is still marked preview as of this writing. I'd treat any settings-as-code pipeline built on it as something to watch closely rather than something to fully trust unattended, at least until it graduates to GA.
Key Takeaways
- sempy.fabric.admin is new (it shipped in semantic-link 0.14.0), so make sure you're on a current package version before reaching for it.
- list_tenant_settings() and list_activity_events() both return clean, humanized DataFrames with auth handled for you. No FabricRestClient, no manual JSON parsing.
- Snapshot tenant settings daily to a Delta table if you want config drift detection. A point-in-time read alone gives you no history to diff against.
- Audit events are constrained to a same-UTC-day window per call and a 28-day lookback (not 30, despite what you'll read elsewhere), and require Fabric Administrator rights or a service principal.
- Config state and activity history are complementary, not redundant. One tells you what's allowed, the other tells you what's actually happening.
If you want to build either of these patterns, the full notebooks from the session are linked in the Fabric Discord, and I'm happy to answer questions there if you get stuck on the lakehouse or Direct Lake side of things.