Tenant Management with SemPy: Settings, Audit Events, and Governance Patterns

Walkthrough of the new sempy.fabric.admin module: retrieving the configuration settings for tenants, auditing the activities performed on those settings, capturing everything into the lakehouse, and creating governance use cases such as config drift and activity dashboards.

I hosted a session in the Fabric Discord together with Teemu Multanen in which we looked at how one can automate Fabric tenant management without constantly going through the Admin Portal each time you want to see what setting is enabled or disabled. This post is the written version of that session.

Anyone who has worked with SemPy probably knows this already, but it is worth saying anyway: the admin functionality is fairly new, and the existing SemPy documentation focuses mostly on its semantic model features rather than on tenant governance.

What Are Tenant Settings?

Tenant settings are the governance knobs that control what your entire organization can do in Fabric. They live in the Admin Portal, under Tenant Settings, and they cover things like:

Setting area | What it controls
Export | Whether users can export data to Excel or CSV
Publishing | Whether users can publish reports to the web
API access | Which security groups can call the REST APIs
Workspace creation | Whether users can create new workspaces
Copilot and AI | Access to Copilot-powered and Azure OpenAI features
Cross-tenant sharing | Whether data can be shared with external tenants

You can read more about how these settings behave (the disabled/enabled/enabled-except-for-groups pattern) in About tenant settings, and the full catalog of what exists today is in the tenant settings index.

The problem with managing these settings entirely from the portal is that there is no native change history. Disabling external sharing one Tuesday, forgetting to write it down, and then trying three weeks later to work out why a partner API broke is not a workable practice. The fix is to pull the settings into code that records the configuration as it changes and alerts you when something looks off.

SemPy's Admin Module

sempy.fabric.admin is a fairly new addition to the SemPy package. It landed in semantic-link 0.14.0, and it's a big one: roughly 75 admin functions covering workspaces, capacities, domains, tenant settings, items, reports, datasets, tags, and activity events, all wrapping the official Fabric and Power BI admin REST APIs.

This module is new enough that your default Fabric runtime might be running an older SemPy version. Run %pip install -U semantic-link at the top of your notebook before importing sempy.fabric.admin, or you'll get an ImportError and wonder why a documented function doesn't exist.
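A quick guard at the top of the notebook can make that failure mode explicit. The sketch below is only an illustration: `parse_version` and `admin_module_available` are hypothetical helpers, and it assumes the installed distribution is registered under the name `semantic-link` (the name used in the pip command above).

```python
from importlib import metadata

MIN_VERSION = (0, 14, 0)  # release that introduced sempy.fabric.admin, per this post


def parse_version(text):
    """Turn '0.14.0' into (0, 14, 0); non-numeric suffixes such as 'rc1' are ignored."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def admin_module_available(installed_version):
    """True when the version string is at least MIN_VERSION."""
    return parse_version(installed_version) >= MIN_VERSION


try:
    # Assumption: the distribution name matches the pip package name.
    installed = metadata.version("semantic-link")
    print(f"semantic-link {installed}: admin module expected = {admin_module_available(installed)}")
except metadata.PackageNotFoundError:
    print("semantic-link is not installed in this environment")
```

If the check reports an older version, run the `%pip install -U semantic-link` cell first and restart the session before importing the admin module.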

Note: to use the admin module, you need to be a Fabric Administrator.

What I like about it compared to the older pattern you'll see in a lot of community blog posts (instantiate fabric.PowerBIRestClient(), call .get("v1/admin/tenantsettings"), parse the JSON yourself) is that auth is handled for you on Fabric compute. No token wrangling, no manually wiring up a FabricRestClient, and the functions return clean DataFrames instead of nested JSON you have to flatten.

Pulling the current tenant settings looks like this:

Python
import pandas as pd
import sempy.fabric.admin as admin

# Get tenant settings
settings = admin.list_tenant_settings()

# Display all columns cleanly
pd.set_option("display.max_columns", None)
pd.set_option("display.max_colwidth", None)
pd.set_option("display.width", 200)

# Sort for readability (columns come back humanized, e.g. "Setting Name")
if "Setting Name" in settings.columns:
    settings = settings.sort_values("Setting Name")

display(settings)

That's the whole thing. list_tenant_settings() returns a DataFrame with one row per setting:

Column | What it holds
Setting Name | The setting's identifier, e.g. AllowExternalSharing
Title | The human-readable label shown in the Admin Portal
Enabled | Whether the setting is currently on
Tenant Setting Group | The category it belongs to
Can Specify Security Groups | Whether the setting supports group-level scoping
Enabled Security Groups | Which groups are included, if scoped
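Because the result is an ordinary DataFrame, the usual pandas filters apply. The sketch below uses a small hand-made frame with the column names from the table above. The rows themselves are hypothetical (only AllowExternalSharing appears in this post), and so is the shape of the group entries.

```python
import pandas as pd

# Hypothetical rows shaped like the list_tenant_settings() output described above.
settings = pd.DataFrame(
    {
        "Setting Name": ["AllowExternalSharing", "PublishToWeb", "ExportToExcel"],
        "Title": ["External sharing", "Publish to web", "Export to Excel"],
        "Enabled": [True, False, True],
        "Tenant Setting Group": ["Sharing", "Publishing", "Export"],
        "Can Specify Security Groups": [True, True, True],
        "Enabled Security Groups": [[{"name": "Partners"}], [], []],
    }
)

# A setting counts as scoped when its group list is a non-empty list.
is_scoped = settings["Enabled Security Groups"].apply(
    lambda groups: isinstance(groups, list) and len(groups) > 0
)

# Enabled settings that apply to the whole tenant rather than to specific groups.
tenant_wide = settings[settings["Enabled"] & ~is_scoped]

# Number of enabled settings per category.
enabled_per_group = settings.groupby("Tenant Setting Group")["Enabled"].sum()

print(tenant_wide[["Setting Name", "Title"]])
print(enabled_per_group)
```

Settings that are enabled tenant-wide are usually the first ones worth reviewing, since nothing narrows who can use them.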

Updating Tenant Settings

Updating tenant settings is just as easy.

Python
import sempy.fabric.admin as admin

tenant_setting_name = "ArtifactOrgAppPreview"

admin.update_tenant_setting(
    tenant_setting=tenant_setting_name,
    enabled=True,
    verbose=1
)

print(f"Updated tenant setting {tenant_setting_name} to enabled.")
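Before writing to a tenant, it can help to decide whether a change is needed at all. The following is a minimal sketch of such a dry-run check. `plan_setting_change` is a hypothetical helper, not part of SemPy, and it only inspects a DataFrame with the "Setting Name" and "Enabled" columns described earlier.

```python
import pandas as pd


def plan_setting_change(settings, setting_name, desired_enabled):
    """Return 'missing', 'noop', or 'update' without calling any API."""
    match = settings[settings["Setting Name"] == setting_name]
    if match.empty:
        return "missing"
    current = bool(match.iloc[0]["Enabled"])
    return "noop" if current == desired_enabled else "update"


# Hypothetical current state; in a notebook this would come from admin.list_tenant_settings().
current_settings = pd.DataFrame(
    {"Setting Name": ["ArtifactOrgAppPreview"], "Enabled": [False]}
)

action = plan_setting_change(current_settings, "ArtifactOrgAppPreview", True)
print(action)
```

Only when the result is "update" would the notebook go on to call update_tenant_setting, which keeps unnecessary writes out of the audit trail.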

Snapshotting Tenant Settings to a Lakehouse

A single call gives you a point-in-time view. To get any value out of this over time, you need to store it somewhere. I land it in a Delta table:

Python
import json
import re
import hashlib
import pandas as pd
from datetime import datetime, timezone
import sempy.fabric.admin as admin

table_name = "tenant_settings_snapshot"
snapshot_utc = datetime.now(timezone.utc)

df = admin.list_tenant_settings()

def clean_col_name(col):
    col = col.strip()
    col = re.sub(r"[^A-Za-z0-9_]", "_", col)
    col = re.sub(r"_+", "_", col)
    return col.strip("_").lower()

df = df.rename(columns={col: clean_col_name(col) for col in df.columns})

# Convert list/dict/object-ish columns to JSON/text
for col in df.columns:
    df[col] = df[col].apply(
        lambda x: json.dumps(x, sort_keys=True, default=str)
        if isinstance(x, (dict, list))
        else x
    )

# Force known problem columns to string, even when all values are null
string_columns = [
    "setting_name",
    "tenant_setting_group",
    "enabled_security_groups",
    "excluded_security_groups",
    "security_groups",
    "description",
    "title"
]

for col in string_columns:
    if col in df.columns:
        df[col] = df[col].astype("string").fillna("")

df["snapshot_utc"] = snapshot_utc.isoformat()
df["snapshot_date"] = snapshot_utc.date().isoformat()

hash_cols = [c for c in df.columns if c not in ["snapshot_utc", "snapshot_date"]]

def row_hash(row):
    payload = json.dumps({c: row[c] for c in hash_cols}, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

df["setting_hash"] = df.apply(row_hash, axis=1)

spark_df = spark.createDataFrame(df)

# Extra safety: cast any NullType columns to string
for field in spark_df.schema.fields:
    if field.dataType.simpleString() == "void":
        spark_df = spark_df.withColumn(field.name, spark_df[field.name].cast("string"))

spark_df.write \
    .format("delta") \
    .mode("append") \
    .option("mergeSchema", "true") \
    .saveAsTable(table_name)

print(f"Saved {spark_df.count()} tenant settings to {table_name}")

Cleaning the columns matters more than it might seem. The Enabled Security Groups column arrives as a list of dicts inside the DataFrame, and Spark cannot infer a schema for it. That is why I serialize anything that looks like a dict or a list to JSON before writing. The mergeSchema option is there because Microsoft adds new tenant setting properties from time to time.
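The sort_keys=True argument is what keeps the row hash stable. Here is a small standalone illustration using only the standard library. The group payloads and the helper names `stable_json` and `content_hash` are made up for the example.

```python
import hashlib
import json


def stable_json(value):
    """Serialize nested values deterministically so equal content gives equal text."""
    return json.dumps(value, sort_keys=True, default=str)


def content_hash(row):
    """SHA-256 over the deterministic JSON form of a row dict."""
    return hashlib.sha256(stable_json(row).encode("utf-8")).hexdigest()


# Hypothetical group payloads: same content, different key order.
groups_a = [{"graphId": "1111", "name": "Finance"}]
groups_b = [{"name": "Finance", "graphId": "1111"}]

row_a = {
    "setting_name": "AllowExternalSharing",
    "enabled": True,
    "enabled_security_groups": stable_json(groups_a),
}
row_b = {
    "enabled": True,
    "setting_name": "AllowExternalSharing",
    "enabled_security_groups": stable_json(groups_b),
}
row_c = dict(row_a, enabled=False)  # same setting, flipped state

print(content_hash(row_a) == content_hash(row_b))  # key order does not matter
print(content_hash(row_a) == content_hash(row_c))  # a real change does
```

One caveat: sort_keys reorders dictionary keys but not list elements, so if the groups came back in a different order between two calls, the hash would change even though the membership did not. Sorting the list before serializing would avoid that.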

Pattern: Config Drift Detection

This is why you want to take snapshots each day as opposed to just running the function whenever you remember.

Schedule the snapshot: a Data Pipeline runs the notebook above each day and appends the full settings snapshot to the Delta table without anyone having to do anything.

Diff against the previous run: a full outer join between the latest snapshot and the previous one, keyed on setting_name. A row that exists on only one side is a setting that was added or removed, and a row whose setting_hash differs between the two snapshots is drift. It is nothing more than a query against the snapshot table.

Alert and document: write the drift rows to another Delta table called drift_log and build a Power BI report on it that shows every change to each setting over time. The next time somebody asks whether a sharing permission was changed last month, you can give them an answer.

Python
from pyspark.sql import functions as F
from pyspark.sql.window import Window

table_name = "tenant_settings_snapshot"

df = spark.table(table_name)

# Get latest and previous snapshot timestamps
snapshot_rows = (
    df.select("snapshot_utc")
      .distinct()
      .orderBy(F.col("snapshot_utc").desc())
      .limit(2)
      .collect()
)

if len(snapshot_rows) < 2:
    print("Not enough snapshots to compare. Capture at least two snapshots first.")
else:
    latest_snapshot = snapshot_rows[0]["snapshot_utc"]
    previous_snapshot = snapshot_rows[1]["snapshot_utc"]

    latest_df = (
        df.filter(F.col("snapshot_utc") == latest_snapshot)
          .alias("l")
    )

    previous_df = (
        df.filter(F.col("snapshot_utc") == previous_snapshot)
          .alias("p")
    )

    changes_df = (
        latest_df
        .join(previous_df, on="setting_name", how="full_outer")
        .select(
            F.coalesce(F.col("l.setting_name"), F.col("p.setting_name")).alias("setting_name"),

            F.col("p.snapshot_utc").alias("previous_snapshot_utc"),
            F.col("l.snapshot_utc").alias("latest_snapshot_utc"),

            F.col("p.enabled").alias("previous_enabled"),
            F.col("l.enabled").alias("latest_enabled"),

            F.col("p.enabled_security_groups").alias("previous_enabled_security_groups"),
            F.col("l.enabled_security_groups").alias("latest_enabled_security_groups"),

            F.col("p.setting_hash").alias("previous_setting_hash"),
            F.col("l.setting_hash").alias("latest_setting_hash"),

            F.when(F.col("p.setting_name").isNull(), F.lit("Added"))
             .when(F.col("l.setting_name").isNull(), F.lit("Removed"))
             .when(F.col("p.setting_hash") != F.col("l.setting_hash"), F.lit("Changed"))
             .otherwise(F.lit("Unchanged"))
             .alias("change_type")
        )
        .filter(F.col("change_type") != "Unchanged")
        .orderBy("change_type", "setting_name")
    )

    change_count = changes_df.count()

    if change_count == 0:
        print(
            f"No tenant setting changes detected.\n"
            f"Previous snapshot: {previous_snapshot}\n"
            f"Latest snapshot  : {latest_snapshot}"
        )
    else:
        print(f"{change_count} tenant setting change(s) detected.")
        print(f"Previous snapshot: {previous_snapshot}")
        print(f"Latest snapshot  : {latest_snapshot}")
        print()

        for row in changes_df.collect():
            print("=" * 100)
            print(f"Setting : {row['setting_name']}")
            print(f"Change  : {row['change_type']}")

            if row["change_type"] == "Added":
                print(f"Enabled : {row['latest_enabled']}")
                print(f"Groups  : {row['latest_enabled_security_groups']}")

            elif row["change_type"] == "Removed":
                print(f"Previous Enabled : {row['previous_enabled']}")
                print(f"Previous Groups  : {row['previous_enabled_security_groups']}")

            elif row["change_type"] == "Changed":
                print()
                print("Enabled:")
                print(f"  Previous: {row['previous_enabled']}")
                print(f"  Latest  : {row['latest_enabled']}")

                print()
                print("Security Groups:")
                print(f"  Previous: {row['previous_enabled_security_groups']}")
                print(f"  Latest  : {row['latest_enabled_security_groups']}")

The portal does show a banner when Microsoft adds or changes a setting, but it says nothing about changes your own admins made. The drift pattern above catches both, which is the whole point.
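The classification rule in the Spark job is small enough to mirror in plain Python, which is handy for unit-testing the logic outside Fabric. This is a sketch only: `classify_changes` is a hypothetical helper and the hashes are placeholder strings.

```python
def classify_changes(previous, latest):
    """Compare two {setting_name: setting_hash} snapshots and label the differences."""
    changes = {}
    for name in sorted(set(previous) | set(latest)):
        if name not in previous:
            changes[name] = "Added"
        elif name not in latest:
            changes[name] = "Removed"
        elif previous[name] != latest[name]:
            changes[name] = "Changed"
        # identical hashes are left out, like the "Unchanged" filter in the Spark version
    return changes


# Placeholder snapshots; real values would be the setting_hash column per snapshot.
previous = {"AllowExternalSharing": "aaa", "PublishToWeb": "bbb", "OldSetting": "ccc"}
latest = {"AllowExternalSharing": "aaa", "PublishToWeb": "zzz", "NewSetting": "ddd"}

changes = classify_changes(previous, latest)
for name, change_type in changes.items():
    print(f"{change_type:8} {name}")
```

The same three labels are what would land in a drift_log table, one row per changed setting per comparison.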

Audit Events via the Power BI REST API

Tenant settings show you what can be done. Audit operations show you what is being done, which is the other half of governance information that can’t be provided by settings alone.

The GetActivityEvents Endpoint

Although this is a Power BI REST API rather than a Fabric one, it is the source of information on tenant-wide activities, and sempy.fabric.admin wraps it. A non-exhaustive list of the operation types it reports:

Operation | What it tells you
ViewReport | Who viewed which report, and when
ExportReport | PDF, PowerPoint, and image exports
ExportData | Underlying dataset data exports
DeleteDataset | Dataset deletions
ShareDashboard | Dashboard sharing events
CreateApp | App creation
AnalyzeInExcel | Open-in-Excel and other data extracts

Once you have collected a few weeks of data, questions such as which reports have never been opened, who exports the most data, and whether anyone used a dataset before it was deleted become answerable.

Key Constraints to Know Upfront

Before beginning to write the script, it is best to plan both the loop and the required permissions.

Constraint | Detail
Same UTC day window | start_time and end_time must fall within the same UTC day. A single call covers up to 24 hours.
Lookback limit | Events older than 28 days aren't available through this endpoint.
Fabric Administrator required | The caller must be a Fabric administrator, or authenticate with a service principal.
Pagination | Large result sets return a continuation token; list_activity_events() handles paging internally, so you get back a single complete DataFrame for the window you asked for.
Rate limit | 200 requests per hour for activity events (compare to 25 requests per minute for tenant settings reads and writes).

You'll see "30 days" quoted constantly in community blog posts and even in some conference decks (mine included, until I checked). The current Admin - Get Activity Events reference on Microsoft Learn says 28 days, not 30. If you're building a retention strategy around this, build in a buffer rather than cutting it close on day 28.
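These constraints are easy to check before a request ever leaves the notebook. The sketch below is one way to do it. `check_window` is a hypothetical helper, it expects timezone-aware UTC datetimes, and its 28-day comparison is an approximation of the lookback rule rather than the service's exact boundary.

```python
from datetime import datetime, timedelta, timezone

LOOKBACK_DAYS = 28  # per the Microsoft Learn reference mentioned above


def check_window(start, end, now=None):
    """Return a list of problems with a requested activity-events window."""
    now = now or datetime.now(timezone.utc)
    problems = []
    if start > end:
        problems.append("start is after end")
    if start.date() != end.date():
        problems.append("window spans more than one UTC day")
    if start < now - timedelta(days=LOOKBACK_DAYS):
        problems.append("start is older than the lookback limit")
    return problems


utc = timezone.utc
fixed_now = datetime(2025, 6, 30, 12, 0, tzinfo=utc)  # fixed clock so the example is repeatable

ok = check_window(
    datetime(2025, 6, 29, 0, 0, tzinfo=utc),
    datetime(2025, 6, 29, 23, 59, 59, tzinfo=utc),
    now=fixed_now,
)
spanning = check_window(
    datetime(2025, 6, 29, 22, 0, tzinfo=utc),
    datetime(2025, 6, 30, 2, 0, tzinfo=utc),
    now=fixed_now,
)
too_old = check_window(
    datetime(2025, 5, 1, 0, 0, tzinfo=utc),
    datetime(2025, 5, 1, 23, 59, 59, tzinfo=utc),
    now=fixed_now,
)
print(ok, spanning, too_old)
```

An empty list means the window passes these local checks. It does not guarantee the service will accept the request.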

Pulling Events with list_activity_events()

Python
from datetime import datetime, timezone, time
import sempy.fabric.admin as admin

today_utc = datetime.now(timezone.utc).date()
start_time = datetime.combine(today_utc, time.min).strftime("%Y-%m-%dT%H:%M:%S")
end_time = datetime.combine(today_utc, time.max).strftime("%Y-%m-%dT%H:%M:%S")

events = admin.list_activity_events(
    start_time=start_time,
    end_time=end_time,
    activity="ViewReport",  # optional
)
display(events)

Pass start_time and end_time, which must fall on the same UTC day or the function raises a ValueError. Optionally pass activity or user_id to filter at the source instead of pulling every record and filtering afterwards in pandas.

This is a point of confusion when porting scripts that directly used the raw REST API because, here, the DataFrame columns are in readable form, rather than camelCase. That means df["User Id"] and df["Workspace Name"], not df["UserId"] and df["WorkspaceName"].
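When porting older scripts, a tolerant column lookup can bridge the two naming styles. This is a hypothetical helper, not something SemPy provides, and the "Creation Time" column name in the example is an assumption.

```python
import re


def normalize(name):
    """Lowercase and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def find_column(columns, wanted):
    """Return the actual column whose normalized form matches `wanted`, else None."""
    target = normalize(wanted)
    for col in columns:
        if normalize(col) == target:
            return col
    return None


# Humanized names as described above; "Creation Time" is assumed for illustration.
columns = ["User Id", "Workspace Name", "Creation Time"]

print(find_column(columns, "UserId"))
print(find_column(columns, "workspace_name"))
print(find_column(columns, "ReportName"))
```

With a DataFrame, the call would be find_column(df.columns, "UserId"), and the returned name can be used to index the frame regardless of which style the old script assumed.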

Pattern: Lightweight Activity Dashboard

The loop isn't all too difficult if you think of the same-UTC-day limit as our basic unit of work: loop through the days, write those days' worth of data into the lakehouse, create a Direct Lake semantic model layer above that, and report from the latter.

A Data Pipeline runs a notebook daily that makes its query (one call per day, given the API's constraint), writes the result to a Delta table partitioned by date, and finishes. There is no real loop unless you are backfilling, in which case you need one call per day across the backfill range because of the same-day constraint.
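For the backfill case, the per-day windows can be generated up front. Here is a small sketch using the same timestamp format as the example above. `daily_windows` is a hypothetical helper and the dates are arbitrary.

```python
from datetime import date, datetime, time, timedelta

FMT = "%Y-%m-%dT%H:%M:%S"


def daily_windows(first_day, last_day):
    """Yield (start_time, end_time) strings, one pair per UTC day, inclusive."""
    day = first_day
    while day <= last_day:
        yield (
            datetime.combine(day, time.min).strftime(FMT),
            datetime.combine(day, time.max).strftime(FMT),
        )
        day += timedelta(days=1)


windows = list(daily_windows(date(2025, 6, 1), date(2025, 6, 3)))
for start_time, end_time in windows:
    # In a notebook, each pair would go to
    # admin.list_activity_events(start_time=start_time, end_time=end_time)
    print(start_time, end_time)
```

A full 28-day backfill is 28 of these windows, which leaves plenty of room under the hourly rate limit as long as each window maps to a modest number of underlying requests.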

From there, create a Direct Lake semantic model on top of the lakehouse: no imports and no scheduled refreshes to keep an eye on. A handful of views will do most of the reporting work: reports with few or no views, exports per user to identify heavy exporters, an hourly activity heat map to see which times of day put the most load on your capacity, and a trend line over the past 30 days.
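The aggregations behind those views are simple group-bys. The sketch below runs them in pandas on a tiny hand-made frame. The rows are invented, and apart from "User Id" the column names ("Activity", "Report Name", "Creation Time") are assumptions about the humanized output.

```python
import pandas as pd

# Invented sample events; column names other than "User Id" are assumed.
events = pd.DataFrame(
    {
        "Activity": ["ViewReport", "ViewReport", "ExportData", "ViewReport", "ExportData"],
        "User Id": [
            "ana@contoso.com",
            "ben@contoso.com",
            "ana@contoso.com",
            "ana@contoso.com",
            "ana@contoso.com",
        ],
        "Report Name": ["Sales", "Sales", "Sales", "Finance", "Finance"],
        "Creation Time": pd.to_datetime(
            [
                "2025-06-01T09:15:00",
                "2025-06-01T09:40:00",
                "2025-06-01T10:05:00",
                "2025-06-01T14:30:00",
                "2025-06-01T14:45:00",
            ],
            utc=True,
        ),
    }
)

# Least-viewed reports first.
views = events[events["Activity"] == "ViewReport"]
views_per_report = views.groupby("Report Name").size().sort_values()

# Heaviest exporters first.
exports = events[events["Activity"] == "ExportData"]
exports_per_user = exports.groupby("User Id").size().sort_values(ascending=False)

# Event count per UTC hour of day, the basis for a heat map.
events_per_hour = events.groupby(events["Creation Time"].dt.hour).size()

print(views_per_report)
print(exports_per_user)
print(events_per_hour)
```

In the Direct Lake model the same logic would live in measures or SQL views. The pandas version is just a quick way to sanity-check the numbers.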

After the data is safely stored away, the next step is to combine the tenant settings snapshot with the activity events data to paint a picture of who changed what and when.

Python
import pandas as pd
from datetime import datetime, timezone, time
from pyspark.sql import functions as F

import sempy.fabric.admin as admin

table_name = "tenant_settings_snapshot"


# ------------------------------------------------------------
# Helpers
# ------------------------------------------------------------

def get_col(row, *names):
    for name in names:
        value = row.get(name, "")
        if value not in [None, ""]:
            return value
    return ""


def clean_col_name(col):
    import re
    col = col.strip()
    col = re.sub(r"[^A-Za-z0-9_]", "_", col)
    col = re.sub(r"_+", "_", col)
    return col.strip("_").lower()


# ------------------------------------------------------------
# 1. Detect changed tenant settings
# ------------------------------------------------------------

df = spark.table(table_name)

snapshot_rows = (
    df.select("snapshot_utc")
      .distinct()
      .orderBy(F.col("snapshot_utc").desc())
      .limit(2)
      .collect()
)

if len(snapshot_rows) < 2:
    print("Not enough snapshots to compare. Need at least two snapshots.")

else:
    latest_snapshot = snapshot_rows[0]["snapshot_utc"]
    previous_snapshot = snapshot_rows[1]["snapshot_utc"]

    latest_df = df.filter(F.col("snapshot_utc") == latest_snapshot).alias("l")
    previous_df = df.filter(F.col("snapshot_utc") == previous_snapshot).alias("p")

    changes_df = (
        latest_df
        .join(previous_df, on="setting_name", how="full_outer")
        .select(
            F.coalesce(F.col("l.setting_name"), F.col("p.setting_name")).alias("setting_name"),

            F.col("p.snapshot_utc").alias("previous_snapshot_utc"),
            F.col("l.snapshot_utc").alias("latest_snapshot_utc"),

            F.col("p.enabled").alias("previous_enabled"),
            F.col("l.enabled").alias("latest_enabled"),

            F.col("p.enabled_security_groups").alias("previous_enabled_security_groups"),
            F.col("l.enabled_security_groups").alias("latest_enabled_security_groups"),

            F.col("p.setting_hash").alias("previous_setting_hash"),
            F.col("l.setting_hash").alias("latest_setting_hash"),

            F.when(F.col("p.setting_name").isNull(), F.lit("Added"))
             .when(F.col("l.setting_name").isNull(), F.lit("Removed"))
             .when(F.col("p.setting_hash") != F.col("l.setting_hash"), F.lit("Changed"))
             .otherwise(F.lit("Unchanged"))
             .alias("change_type")
        )
        .filter(F.col("change_type") != "Unchanged")
        .orderBy("setting_name")
    )

    changes = changes_df.collect()

    if not changes:
        print("No tenant setting changes detected.")
        print(f"Previous snapshot: {previous_snapshot}")
        print(f"Latest snapshot  : {latest_snapshot}")

    else:
        print(f"{len(changes)} tenant setting change(s) detected.")
        print(f"Previous snapshot: {previous_snapshot}")
        print(f"Latest snapshot  : {latest_snapshot}")
        print()


        # ------------------------------------------------------------
        # 2. Get UpdatedAdminFeatureSwitch audit events
        # ------------------------------------------------------------

        previous_dt = datetime.fromisoformat(previous_snapshot.replace("Z", "+00:00"))
        latest_dt = datetime.fromisoformat(latest_snapshot.replace("Z", "+00:00"))

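        # The endpoint only accepts same-UTC-day windows, and daily snapshots
        # usually straddle two UTC days, so query every day between the two
        # snapshots and combine the results.
        day_frames = []
        for day in pd.date_range(previous_dt.date(), latest_dt.date(), freq="D"):
            day_events = admin.list_activity_events(
                start_time=day.strftime("%Y-%m-%dT00:00:00"),
                end_time=day.strftime("%Y-%m-%dT23:59:59")
            )
            day_frames.append(
                day_events if isinstance(day_events, pd.DataFrame) else pd.DataFrame(day_events)
            )

        events_pdf = pd.concat(day_frames, ignore_index=True)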

        if events_pdf.empty:
            updated_events = pd.DataFrame()
        else:
            events_pdf.columns = [clean_col_name(c) for c in events_pdf.columns]

            updated_events = events_pdf[
                events_pdf.apply(
                    lambda row: row.astype(str).str.contains(
                        "UpdatedAdminFeatureSwitch",
                        case=False,
                        na=False
                    ).any(),
                    axis=1
                )
            ].copy()

            if "creation_time" in updated_events.columns:
                updated_events["creation_time_parsed"] = pd.to_datetime(
                    updated_events["creation_time"],
                    errors="coerce",
                    utc=True
                )

                updated_events = updated_events[
                    (updated_events["creation_time_parsed"] >= previous_dt) &
                    (updated_events["creation_time_parsed"] <= latest_dt)
                ]


        # ------------------------------------------------------------
        # 3. Friendly output
        # ------------------------------------------------------------

        for change in changes:
            print("=" * 120)
            print(f"Tenant setting changed: {change['setting_name']}")
            print(f"Change type           : {change['change_type']}")
            print()

            print("Enabled:")
            print(f"  Previous: {change['previous_enabled']}")
            print(f"  Latest  : {change['latest_enabled']}")
            print()

            print("Security Groups:")
            print(f"  Previous: {change['previous_enabled_security_groups']}")
            print(f"  Latest  : {change['latest_enabled_security_groups']}")
            print()

            if updated_events.empty:
                print("Changed by: No matching UpdatedAdminFeatureSwitch audit event found in the snapshot window.")
            else:
                print("Possible UpdatedAdminFeatureSwitch actor event(s):")

                for _, event in updated_events.iterrows():
                    print("-" * 80)
                    print(f"Time      : {get_col(event, 'creation_time')}")
                    print(f"User      : {get_col(event, 'user_id', 'user_key')}")
                    print(f"Activity  : {get_col(event, 'activity')}")
                    print(f"Operation : {get_col(event, 'operation')}")
                    print(f"Is Success: {get_col(event, 'is_success')}")
                    print(f"Raw object : {get_col(event, 'object_id', 'object_display_name', 'item_name')}")
            print()

Putting It Together: A Governance Story

State and activity answer two different questions, and you need both to get the full picture.

The list_tenant_settings() call shows you what is currently allowed: which features are enabled, which groups they are scoped to, and, once you snapshot daily, when each of those last changed. The list_activity_events() call shows you what people are actually doing with that access, over whatever period you have retained.

Combined, they answer the question every governance discussion comes back to: not just whether a setting is turned on, but who is using what it unlocks and whether they are behaving as expected.

If your next step is to start flipping tenant settings programmatically with update_tenant_setting, know that the underlying Tenants - Update Tenant Setting REST API is still marked preview as of this writing. I'd treat any settings-as-code pipeline built on it as something to watch closely rather than something to fully trust unattended, at least until it graduates to GA.

Key Takeaways

  • sempy.fabric.admin is new (it shipped in semantic-link 0.14.0), so make sure you're on a current package version before reaching for it.
  • list_tenant_settings() and list_activity_events() both return clean, humanized DataFrames with auth handled for you. No FabricRestClient, no manual JSON parsing.
  • Snapshot tenant settings daily to a Delta table if you want config drift detection. A point-in-time read alone gives you no history to diff against.
  • Audit events are constrained to a same-UTC-day window per call and a 28-day lookback (not 30, despite what you'll read elsewhere), and require Fabric Administrator rights or a service principal.
  • Config state and activity history are complementary, not redundant. One tells you what's allowed, the other tells you what's actually happening.

If you want to build either of these patterns, the full notebooks from the session are linked in the Fabric Discord, and I'm happy to answer questions there if you get stuck on the lakehouse or Direct Lake side of things.
